Being part of a bigger idea

by mandel on August 30th, 2009

Since I read Statik’s comment on the post where I talked about porting macaco-contacts to CouchDB, I have been working to align my serialization with the one proposed by the desktopcouch project at freedesktop.org. I really believe I should make the effort to work with their project as much as possible, especially because I love their idea, and also because the more we work in the same direction the better. To begin with, one of the great things about using desktopcouch’s record is that their plug-in for Evolution and macaco-contacts will use the same storage, which means sync between Evolution contacts and macaco-contacts, yeah!!

Changes in macaco-contacts

So far I have managed to make all the required changes to store the data of macaco-contacts following the desktopcouch record, so contacts right now look like this in CouchDB:

{
    'first_name': 'Manuel', 
    'last_name': 'de la Pena',
    'addresses': {
        '21e62807c53147be9f61bca1fba4550f': {
            'city': 'Madrid', 
            'description': 'Work Local', 
            'country': 'Spain', 
            'state': 'Madrid', 
            'address1': 'C/Caracas 19 6 drch', 
            'postalcode': '28010'
         }, 
         'b50d84754b814614a42ba24a3d365fd6': {
            'city': 'Brussels', 
            'description': 'Work Local', 
            'country': 'Belgium', 
            'state': 'Brussels', 
            'address1': 'Rue du Prevot', 
            'postalcode': '1050'
         }
    }, 
    'application_annotations': {
        'macaco-contacts': {
            'middle_name': '',
            'social_accounts': {}, 
            'title': 'Mr.', 
            'comments': '', 
            'print_mode': 0, 
            'webs': {}, 
            'ims': {}, 
            'preferred_address': {
                'city': 'Madrid', 
                'description': 'Work Local', 
                'country': 'Spain', 
                'state': 'Madrid', 
                'address1': 'C/Caracas 19 6 drch', 
                'postalcode': '28010', 
                '_id': '21e62807c53147be9f61bca1fba4550f'
            }, 
            'logo': '', 
            'type': 'Person'
         }
     }, 
     'record_type': 'http://www.freedesktop.org/wiki/Specifications/desktopcouch/contact', 
     'birth_date': '30-8-2009', 
     'phone_numbers': {
         '53eb30d06e4148b7bc119dce014fe320': {
             'number': '9231933', 
             'description': 'work'
         }, 
         'e802351ce1bc4442843bb24a105b8dd2': {
             'number': '22333', 
             'description': 'work'
         }
      }, 
      '_id': '12247bc5d85e4d77933e7826ae95c7f9', 
      'email_addresses': {
          'dd02ef6675b642508e14187ad40d8519': {
              'description': 'Home Internet', 
              'address': 'etil15@gmail.com'
           }, 
           '645eb9a85cfb43a8b823f51176e46e60': {
               'description': 'Home Internet', 
               'address': 'mandel@themacaque.com'
           }
     }
}

Now, although I have made the changes, I do have some small complaints about the record definition:

application_annotations

If I understood correctly, which I hope I did, application_annotations is a dictionary that should contain the data specific to each application, where each application uses its name as the key. Although I kind of understand what the team was going for, I really believe they have let their thinking be polluted by relational databases.

As most of you probably know, CouchDB uses documents, and in documents relations are stored in dictionaries/collections (or at least that is one of the ways to do it). In this case application_annotations seems to try to represent a relation between a contact and the data of that contact that is specific to an application… Extra information related to an application should be added as new properties in the JSON serialization rather than as nested dictionaries. But before I say what I would have done, let me explain my concerns:

Moving data around
With this record definition, application developers like me have to make sure that they get just their own data into the application. My concern with this approach is that creating views will be more complicated than usual. If an application wants just its own data, it will have to define a view similar to the following:

function(doc) {
    if (doc.first_name && doc.last_name 
        && doc.application_annotations) {
        // emit the first and last name as the key and
        // only the macaco-contacts data as the value
        // (emit takes exactly two arguments: key and value)
        var macaco_data = doc.application_annotations["macaco-contacts"];
        emit([doc.first_name, doc.last_name], macaco_data);
    }
}

The above function is not as complicated as it could be because it just emits the dictionary with the macaco-contacts data, but what about emitting a plain row? That will certainly be a lot harder! Do we expect application developers to write complicated views? I’m sure they prefer to focus on their app rather than on this kind of housekeeping… Developers must choose between spending time defining complicated yet useful views or simply carrying around the data used by other applications… If the developer decides to use his own views (obviously the correct way), we run into the second problem.
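To give an idea of the housekeeping involved in building a plain row, here is a minimal sketch written in Python rather than in the view’s JavaScript. The helper name flatten_contact is made up for this illustration, and dropping nested dictionaries is just one possible choice, not something the record definition prescribes.

```python
def flatten_contact(doc, app_name):
    """Return a flat row: the shared name fields plus the scalar
    annotations of a single application (hypothetical helper)."""
    row = {
        'first_name': doc.get('first_name'),
        'last_name': doc.get('last_name'),
    }
    annotations = doc.get('application_annotations', {}).get(app_name, {})
    for key, value in annotations.items():
        # nested dictionaries cannot live in a plain row, so skip them
        if not isinstance(value, dict):
            row[key] = value
    return row


contact = {
    'first_name': 'Manuel',
    'last_name': 'de la Pena',
    'application_annotations': {
        'macaco-contacts': {'title': 'Mr.', 'print_mode': 0, 'webs': {}},
    },
}
row = flatten_contact(contact, 'macaco-contacts')
```

Every application would need its own variant of this, which is exactly the kind of code I would rather not write.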

Corrupting other applications’ data or performance impact when updating
Let’s assume that my application works only with documents that contain just my data. In that case we might write something like this:

# get the data from the view using a db (python-couchdb lib);
# each row's value is assumed to be the trimmed document (with its _rev)
name = 'macaco-contacts/contacts'
rows = db.view(name)
for current in rows:
    doc = current.value
    doc['first_name'] = 'test'
    db[current.id] = doc

Here we assume that the rows are documents with the following data (notice there are no annotations from any other application!!!):

{
    'first_name': 'Manuel', 
    'last_name': 'de la Pena',
    'addresses': {
        '21e62807c53147be9f61bca1fba4550f': {
            'city': 'Madrid', 
            'description': 'Work Local', 
            'country': 'Spain', 
            'state': 'Madrid', 
            'address1': 'C/Caracas 19 6 drch', 
            'postalcode': '28010'
         }, 
         'b50d84754b814614a42ba24a3d365fd6': {
            'city': 'Brussels', 
            'description': 'Work Local', 
            'country': 'Belgium', 
            'state': 'Brussels', 
            'address1': 'Rue du Prevot', 
            'postalcode': '1050'
         }
    }, 
    'application_annotations': {
        'macaco-contacts': {
            'middle_name': '',
            'social_accounts': {}, 
            'title': 'Mr.', 
            'comments': '', 
            'print_mode': 0, 
            'webs': {}, 
            'ims': {}, 
            'preferred_address': {
                'city': 'Madrid', 
                'description': 'Work Local', 
                'country': 'Spain', 
                'state': 'Madrid', 
                'address1': 'C/Caracas 19 6 drch', 
                'postalcode': '28010', 
                '_id': '21e62807c53147be9f61bca1fba4550f'
            }, 
            'logo': '', 
            'type': 'Person'
         }
     }, 
     'record_type': 'http://www.freedesktop.org/wiki/Specifications/desktopcouch/contact', 
     'birth_date': '30-8-2009', 
     'phone_numbers': {
         '53eb30d06e4148b7bc119dce014fe320': {
             'number': '9231933', 
             'description': 'work'
         }, 
         'e802351ce1bc4442843bb24a105b8dd2': {
             'number': '22333', 
             'description': 'work'
         }
      }, 
      '_id': '12247bc5d85e4d77933e7826ae95c7f9', 
      'email_addresses': {
          'dd02ef6675b642508e14187ad40d8519': {
              'description': 'Home Internet', 
              'address': 'etil15@gmail.com'
           }, 
           '645eb9a85cfb43a8b823f51176e46e60': {
               'description': 'Home Internet', 
               'address': 'mandel@themacaque.com'
           }
     }
}

Can you find the HUGE bug in the Python code? The code is self-explanatory, but let’s look at it line by line:

name = 'macaco-contacts/contacts'
rows = db.view(name)
# loop through all the documents
for current in rows:
    # update the name property
    doc = current.value
    doc['first_name'] = 'test'
    # update the document with the same id as current
    # the bug is here
    db[current.id] = doc

Because the developer is working with just his own data, when he performs an update he is going to remove ALL the data from OTHER applications (I can see some people getting pissed off already). To avoid this, a developer has to perform a GET to fetch all the data he is not interested in, add it to the new doc and finally perform the PUT. Why the extra GET? Because we do not have the data, unless of course we chose to carry all of it around in our app (bigger memory footprint)… Do we really want to have this type of code?!
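For reference, the GET, merge and PUT dance could look roughly like the sketch below. The function name safe_update is hypothetical, and db is assumed to be any mapping from document id to document; a plain dict stands in for the database here, and I am assuming a python-couchdb database object can be used the same way through db[id].

```python
import copy


def safe_update(db, partial_doc, app_name):
    """Merge our partial document into the stored one and save it back,
    leaving the annotations of other applications untouched."""
    doc_id = partial_doc['_id']
    stored = copy.deepcopy(db[doc_id])  # the extra GET
    annotations = stored.setdefault('application_annotations', {})
    for key, value in partial_doc.items():
        if key.startswith('_'):
            # keep the stored _id and _rev as they are
            continue
        if key == 'application_annotations':
            # only overwrite the annotations that belong to our application
            if app_name in value:
                annotations[app_name] = value[app_name]
        else:
            stored[key] = value
    db[doc_id] = stored  # the PUT
    return stored
```

It works, but it costs one more round trip per update, which is the performance impact mentioned in the heading.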

My point of view is that we should remove these extra nested dictionaries and let applications add any extra fields they want. CouchDB has a great view management system where the data is cached and the views are not recomputed unless there is an update that affects them. In a document-based database you should not worry about the size of the documents but about the size of the data you are working on… Let the guys in CouchDB worry about dealing with large documents and forget about any type of relations!!!!

No companies record

There is no recommendation regarding the storage of companies, so for the time being I’m simply using something similar to the recommended contact record plus the missing data required for a company. For now this should do the trick :P

Description

The vCard format talks about the different types of phone numbers, addresses and emails as well as the location of addresses (Local and International), but the record just talks about a description. There is no regexp defined for how to get this data out of the description field. Currently I just concatenate the type and the location in the string. A better, clearer definition would be very nice.
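As an illustration of the concatenation, here is a minimal sketch that assumes the description is simply "type location" separated by one space, as in 'Work Local' above. The two helper names are made up for the example, and the format itself is my own convention, not something the record defines.

```python
def make_description(kind, location=None):
    """Build a description such as 'Work Local' (assumed format)."""
    return kind if location is None else '%s %s' % (kind, location)


def parse_description(description):
    """Split a description back into (type, location).

    The location is None when the description holds a single word,
    for example 'work'."""
    parts = description.split(' ', 1)
    kind = parts[0]
    location = parts[1] if len(parts) > 1 else None
    return kind, location
```

Without an agreed format, another application reading the same record has no way to know that 'Work Local' means a local work address.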

Birth date

This is just a stupid complaint, but is it European format or American?
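To show why it matters, the sketch below tries both readings of a date string with the standard datetime module. The function name parse_both is hypothetical and nothing in the record says which of the two formats is intended.

```python
from datetime import datetime


def parse_both(text):
    """Try the day-first and the month-first reading of a date string."""
    results = {}
    formats = (('day-first', '%d-%m-%Y'), ('month-first', '%m-%d-%Y'))
    for label, fmt in formats:
        try:
            results[label] = datetime.strptime(text, fmt).date()
        except ValueError:
            # the string is not a valid date under this reading
            results[label] = None
    return results
```

A value like '30-8-2009' only makes sense day-first, but '3-8-2009' parses under both readings and yields two different dates.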

Conclusion

From now on I’ll be following the record definition from desktopcouch, including any changes they make in the future. I have NO problem changing my code for a greater idea/project as many times as needed; nevertheless, that does not mean that I will not express my opinion about it :P

These are my two cents so far.. I really hope that desktopcouch grows to a point that we can have it in any/every desktop :D

UPDATE

After Rodrigo’s comment the record used is the following:

{
    'first_name': 'Manuel', 
    'last_name': 'de la Pena', 
    'middle_name': '', 
    'addresses': {
        '140f4631b7604ae9b7a9790c9a53a714': {
            'city': 'Brussels', 
            'description': 'Work Local', 
            'country': 'Belgium', 
            'state': 'Brussels', 
            'address1': 'Rue du Prevot', 
            'postalcode': '1050'
        }, 
        '2efab7d13cd74cbfb9c1b8ae0262a54b': {
            'city': 'Madrid', 
            'description': 'Work Local', 
            'country': 'Spain', 
            'state': 'Madrid', 
            'address1': 'C/Caracas 19 6 drch', 
            'postalcode': '28010'
        }
    }, 
    'application_annotations': {
        'macaco-contacts': {
            'logo': '', 
            'preferred_address': '2efab7d13cd74cbfb9c1b8ae0262a54b', 
            'type': 'Person', 
            'print_mode': 0, 
            'social_accounts': {}
       }
    }, 
    'title': 'Mr.', 
    'notes': '', 
    'record_type': 'http://www.freedesktop.org/wiki/Specifications/desktopcouch/contact', 
    'urls': {}, 
    'birth_date': '31-8-2009', 
    'phone_numbers': {
        '2e93b345fbea488685ba892e96c51c25': {
            'number': '22333', 
            'description': 'work'
        }, 
        'b88008f53a8c4430a602a22b851ab489': {
            'number': '9231933', 
            'description': 'work'
        }
    }, 
    '_id': 'fa5517a2bb4443759a74a986c87fa4b0', 
    'email_addresses': {
        '5235ca9d1c644137b74329f90dd3428d': {
           'description': 'Home Internet', 
           'address': 'mandel@themacaque.com'
        }, 
        'b504bb26b1c740dda4bbf3a483596a04': {
           'description': 'Home Internet', 
           'address': 'etil15@gmail.com'
        }
    }, 
    'im_addresses': {}
}
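Since preferred_address now holds only the uuid of an entry in addresses, reading it back takes one extra lookup. A minimal sketch, where get_preferred_address is a hypothetical helper and not part of desktopcouch:

```python
def get_preferred_address(doc, app_name='macaco-contacts'):
    """Resolve the preferred address uuid stored in the application's
    annotations to the matching entry of the top-level addresses."""
    annotations = doc.get('application_annotations', {}).get(app_name, {})
    address_id = annotations.get('preferred_address')
    if address_id is None:
        return None
    # returns None as well when the uuid no longer exists in addresses
    return doc.get('addresses', {}).get(address_id)
```

The address data is stored only once, so there is nothing to keep in sync when the address is edited.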

The code has just been updated (31-08-2009).


  • rodrigo

    Great to read this! Some comments though:

    * we are already storing ‘title’ field in the top level, so just use that instead of storing it under ‘application_annotations/macaco…’
    * ditto for ‘middle_name’, although evolution still does not support it -> https://bugs.launchpad.net/evolution-couchdb/+bug/415297
    * ditto for ‘comments’, we call it ‘notes’
    * ditto for ‘webs’, we store them under ‘urls’, in the same way we do for email_address, phones, addresses
    * ditto for ‘ims’, we will store them under im_addresses, although this is still not done -> https://bugs.launchpad.net/ubunet/+bug/415302

    Also, for storing your preferred_address, you should just store the uuid of the address you want as preferred in the top level ‘addresses’ field. There is no need to duplicate the data.

    So, I suggest you to install evolution-couchdb in Karmic and test integration with that, since it supports most of the fields (and will support the missing ones very soon), so you can easily test working contacts for both applications

  • http://mandel.themacaque.com mandel

    That sounds great, I’ll implement the changes right now, it should not take much from my side.

    I’ll post about the integration between the two address books. It should work with no problem :D

    Cheers for the input!
