Wednesday 24 July 2013

Week 5

Adventures in Git and Bazaar

As the mid-term evaluation for GSoC draws nearer, I have to submit all the work I've done so far to my Mailman mentors for review. The GNU Mailman project uses Launchpad to host all its code. This made things quite difficult, since the work I've been doing so far has been in Git, hosted on my mentor Richard's server, while Launchpad uses the Bazaar version control system.

To upload the code we've been working on in Git to Launchpad, we used a plugin called git-remote-bzr. With this plugin, we were able to interact with Launchpad via Git by turning an individual bzr branch into a Git "remote". You can then fetch from and push to that remote just like any other.

It sounds quite simple, but it took pain and suffering to actually work it all out in a decent way without messing everything up. For one, git-remote-bzr would only work with the master branch (and no other). So after trying the Git syntax for updating a remote branch from a different local branch (among other things), I finally ended up just merging master with the branch I actually wanted to upload (which was fortunately simple in this case) and then pushing out to master.

Another trouble I encountered was that at one point in Postorius' commit history, the commits diverged. The branch which I "forked" on Launchpad was a few commits ahead of the common point where the branches diverged.

So I had to delete that and make a new branch (starting from that common point), and then ensure that all the subsequent changes I pushed came directly after it. That should have worked, but even after getting back to the common commit, I was still getting an error saying I had extra commits.

In the end, Richard pointed me to a Git equivalent of `bzr missing`, which confirmed I had extra commits (the ones made after the divergence point). Removing the .git/bzr directory and doing a fetch again fixed that, and I was able to push!

Keeping up with the schedule

You learn in Software Engineering class about how projects have requirements that change frequently, how things don't always go the way you planned them out, and almost always, you will end up doing things differently than you envisioned. That has been my experience so far in this project. I would say this has been a very important lesson in real world development.

Starting out, I had a plan about how I thought the summer would go and how I would have to work on the project. I had a timeline, which was a rough sketch of what I expected to happen and how I expected to achieve it. Suffice it to say, that was not at all how things worked out once I started doing the actual work.

Transitioning from my schedule to the new one was easy enough. It was not like the milestones I had were set in stone, and I quickly realized that the way to achieve the goals was to work on the new schedule that we were following. The way to do that is to leave enough "wiggle room" and not set very hard expectations about what might or might not happen. You never know what problems you might be faced with, how long it might take to tackle them, and how that will affect your plans.

Sandbox Sandwich

This week, instead of focusing on the actual repository, Richard and I worked on a separate "sandbox" repository, which simulates how the interactions between the three major components (Mailman Core, REST Interface, API Clients) are handled. This was done using three different Django apps, each of which handles its own models and uses an interface to communicate with the neighbouring layer. Not all that exciting, but it has been helping me understand how the system should work a lot better than before. :)
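In spirit, the sandbox idea can be boiled down to a toy sketch like the one below. Every name here is hypothetical (the real sandbox uses Django apps, not plain classes); the point is only that each layer talks to its neighbour through a small interface rather than reaching into its internals.

```python
# Toy sketch of the layered sandbox; all names are hypothetical.
# Each layer exposes a small interface and knows nothing about the
# internals of the layer beneath it.
class Core:
    """Innermost layer: owns the authoritative data."""
    def __init__(self):
        self._members = {"anna@example.com"}

    def is_member(self, address):
        return address in self._members


class RestInterface:
    """Middle layer: only talks to Core through its public methods."""
    def __init__(self, core):
        self._core = core

    def membership(self, address):
        return {"email": address, "member": self._core.is_member(address)}


class ApiClient:
    """Outermost layer: only sees the REST interface."""
    def __init__(self, rest):
        self._rest = rest

    def check(self, address):
        return self._rest.membership(address)["member"]


client = ApiClient(RestInterface(Core()))
```

The payoff of this shape is that any layer can be swapped out (or tested alone) as long as its small interface stays the same.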

Tuesday 16 July 2013

Week 3-4

The past few days haven't been very good. There have been a lot of slow days where I spent the majority of my time being sick, although I did manage to get some work done.

The DRF (Django REST Framework) integration has started, and I created Serializers and ViewSets for a few "first class" models, which DRF uses to expose a REST API.

Serialization is the process of translating data that can be stored (in this case, coming from the SQLite database via the Django ORM) into a format that is easy to transfer across the network (JSON/XML). De-serialization is the opposite: re-constructing the database-friendly format from the JSON/XML. DRF does this by providing Serializer classes, with some built-in classes for easy serialization of Django models. Integrating them wasn't difficult.

The Django REST Framework comes with several built-in Serializers that integrate very well with Django models. This works out nicely, because we get certain things (like pagination) for "free", without actually writing code to handle that particular feature. The resulting API also maps the relationships between the models well. It allows for plenty of customization too, in case we want something other than the defaults (which, I suspect, will be the case in this project).

The data being exposed right now via DRF is in the format it generates by default. In the future, that will change to a customized "schema" for the data format (something like HAL for JSON).

In short, this current version of the API gives us a (somewhat blurry) picture of what we can expect when the project is finished.

After making a rough preview of the API, I have started working on what will be the hardest part of the project so far: having the database of the API models (the ones I have been working on) communicate with the Mailman Core database, and doing create/read/update/delete operations on both of them simultaneously.

I had a discussion with my mentor about the role the new models will play in relation to the Core, and he said to treat my models and the Core as peers, instead of assuming that the Core will be the "main" database. This gives us the advantage that we don't have to think about everything in terms of how it interacts with the Core; our side can be independent of it, sharing only the data it requires.

To interact with the Core's API, I reused practically the same code that was removed from mailman.client, with little to no modification as of now.

The models at the Core and at the REST layer differ in their structure and their relationships, which is why it won't be as simple as dropping in a generic function that updates everything at the peer whenever something is saved in the Django models.

So far, I have managed to make it work for a single User model's write operations, so creating a new User from my models also creates a corresponding User at the Mailman Core.

Doing this wasn't easy, and I encountered a few roadblocks. At first, I was trying to do the creation/update at the peer via Django signals like "post_save" and do any API-related work when that signal was triggered, but that was getting more complicated than I expected. The next day, I figured out an (admittedly hacky) way to do the create/update of the User directly inside the model's .save() method.
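Stripped of the Django machinery, the hacky approach amounts to something like the sketch below. `CoreClient` and `create_user` are hypothetical stand-ins for the mailman.client-derived code that actually talks to Core's API.

```python
# Django-free sketch of mirroring a write to the peer from inside save().
# CoreClient and create_user are hypothetical stand-ins for the real
# mailman.client-derived code.
class CoreClient:
    """Pretend client for Mailman Core's REST API."""
    def __init__(self):
        self.users = []

    def create_user(self, email):
        self.users.append(email)


core = CoreClient()


class User:
    """Stand-in for a Django model with an overridden save()."""
    def __init__(self, email):
        self.email = email

    def save(self):
        # In the real model this would first call super().save(...)
        # to write to the local database...
        # ...and then mirror the write at the peer (Mailman Core):
        core.create_user(self.email)


User("anna@example.com").save()
```

The reason this feels hacky is that a remote API call now hides inside every local save; the signal-based approach would have kept the two concerns separate, at the cost of the extra complexity I ran into.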

So right now, I'm in the middle of the task of making the other models communicate with the Core, which will occupy the upcoming week. Hopefully, it won't be as painful as I'm thinking it is going to be. :)

Tuesday 9 July 2013

Week 2+

With the completion of the models in week 2, the aim was to create a "Simulator" for Postorius, which, as the name implies, was to simulate its functionality using the new models that I have been working on.

Week 2 was spent simultaneously updating the models and making any related changes in Postorius, as well as removing any and all traces of the older mailman.client module, which was being used as the way for Postorius to communicate with the internal Core API.

The purpose of removing mailman.client is to show that pretty much every piece of functionality the Core exposes is now being redirected via the new REST models I have been working on, without anything going directly through mailman.client. This does not mean the client code will become redundant or be removed completely. Part of the mailman.client code will still be used to query the Core API; those queries will just go through the new code, where the results will get cached in the local database to make lookups faster, instead of going directly to Postorius.
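The caching idea, stripped to its bones, might look like the sketch below. `fetch_from_core` stands in for the mailman.client-style query against Core's API, and a plain dict stands in for the local database.

```python
# Sketch of the lookup path: local cache first, Core API on a miss.
# fetch_from_core and the cache dict are hypothetical stand-ins.
core_calls = []        # records how often we actually hit Core
local_cache = {}       # stands in for the local database


def fetch_from_core(address):
    """Pretend mailman.client-style query against Core's API."""
    core_calls.append(address)
    return {"email": address}


def get_user(address):
    if address not in local_cache:        # cache miss: ask Core once
        local_cache[address] = fetch_from_core(address)
    return local_cache[address]           # later lookups stay local


get_user("anna@example.com")
get_user("anna@example.com")              # second call never hits Core
```

The real version has to invalidate or refresh entries when the Core changes, which is where the peer-synchronization work comes in, but the fast-lookup motivation is the same.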

Most of the time was spent working out the branching model for the work we were doing. This was the first time I extensively used Git branches to separate out the various features of the project. It wasn't very obvious or easy, and it took a while for me to figure everything out. My mentor, Richard, has been extremely patient with me and helped me understand how everything worked.

We created separate branches for making any changes to the Postorius code related to the models/views etc., and one for removing traces of mailman.client. Then, after all those changes were complete, I merged them into a new branch called m-simulator.

I also created a Trello board to keep track of the progress and any issues in the project. Since we aren't using GitHub/Bitbucket etc., this seemed like the best way to do it.

The plan for the upcoming week is to start integrating Django REST Framework, in order to get a "preview" of the REST API we might have at the end of the project. Once that is done, we will work on a way for the new models to communicate with the Core API, a job that was previously done by mailman.client.

Lastly, the title of this post is "Week 2+" because I got sick at the end of the week and couldn't finish this post or continue the work on the project. So a couple of days were wasted there, and this post is being published later than it should have been.

That's it for now. Until next week.