Apostol Apostolov

Practical thoughts about software

Fixing large amounts of RavenDB conflicts

Some time ago I had a small issue with RavenDB master-master replication which ended up in creating 50 000 conflicted documents on a live databaseSmile  It all ended fine but I had a little hard time of finding info on how to fix that kind of issue ‘fast’. Don’t get me wrong – there is a very nice RavenDB documentation about replication and conflicts but I didn’t find a fast-and-easy solution(one that involved couple of clicks in Raven’s studio) and I was a little disappointed about that. Guess RavenDB makes you little spoiled.

Anyway, here’s my solution:

1. First, you create an RavenDB index(you can create it through the studio):

Let’s name it ConflictsIndex. This index extracts all the documents of a database in an index so you could query them.

2. Then you Execute it to see the results.

execute

Don’t worry if there are no results at first. If you have large amount of data and large amount of conflicted documents – RavenDB would need some time to index them all. During that time you could:

3. Create an Console app that fixes the conflicts of the documents.

This console app will load all the documents from the currently created index in batches of 1000(usually RavenDB has a default limit 1024 items per select so we use 1000). While loading the documents if there are any conflicts, they will be automatically fixed with the TakeNewestConflictResolutionListener. This listener chooses always the last document to be the resolved one. If you need different custom logic for resolving the conflicts – the listener is the place to insert your code.

4. Execute the console application on the database. You could wait for the ConflictsIndex to finish indexing and run the console app.  If the database is a remote one(most production databases are) – you could copy the console app on some PC that is in the LAN of the PC with the database so the process would be faster.

Also it’s good to put a default Conflict Resolution Listener in your application if you don’t have one. Just in case. If you don’t have one – every time you try to load a conflicted document  - an exception is thrown, so the situation could get pretty heated pretty fast.

And that’s it. Usually you shouldn't get to situations like that but when you do - it’s good to know how you can get out of them.

Tags:

Published at

Originally posted at

The challenges and value of using Git

Some time ago I wrote about my assessing of Using Git versus using SVN in the company that I work for.  It’s been about an year this month since we’ve successfully adopted and moved all the company’s source code to Git and GitHub in particular. I wanted to talk about the process and the issues we had in doing this. Before I begin I must say that I and probably the whole team already are pretty convinced that moving to DVCS and Git is one of the best things that we did. I helped us so much with our feature development that the problems we first had with it were quickly forgotten.

In my previous post that I mentioned above I’ve laid out the problems we had. Now I’ll explain how we work now – one year later.

The company I work for is product company with a couple of web projects(products). Our development team is about 7-8 people in total(including me) and each person is responsible for the development of 1 main feature at a time. He’s also maybe supporting other feature’s he’s built that are live or not actively developing at the moment.

For this way of development we have one main repository where we keep most of our code. The repository is a single one for all our projects(not one for each project) due to the fact that we have a couple of libraries that are shared between all the projects and the most painless scheme to support for developing/sharing the libraries between the projects is all the projects being in one repository.

That said, the way we use the repository that makes us super flexible is by having one branch for each of the features we’re developing. That gives us the flexibility that when we work on a feature, let’s call it Feature-X, we’re able to develop it on Branch-X by the developer in charge. When someone else(a colleague or a manager) needs to see what it’s going with Feature-X he could just switch the branches, load the project in Visual Studio and hit F5. There is always one ‘master’ branch where the ‘live’ version is and that give us the ability to quickly fix a bug and deploy to production without anyone having to do any extra work. It’s awesome. And super flexible.

Anyway I wanted to talk about the challenges we had when integrating Git in our work process.

The team

There is one particular challenge with core .NET developers – they don’t trust technology that is not coming from Microsoft. And Git is not a Microsoft technology. It something that is open-source  and “free” and not-to-be-trusted. For example there is no good Visual tool for working with Git, there is no good integration with Visual Studio(like a plugin), and the official GitHub For Windows app is very nice but sadly still pretty limited in its capabilities. So the best way to use and learn Git is to work with the console. Which my teammates felt like a step back rather than a step forward in our tooling.

The tooling

Frankly there are a number of nice open source Git GUI applications out there the best in my opinion is Git Extensions. But there is a big challenge with these applications. If you come from an SVN or TFS experience background and you don’t know what’s going on with Git - you could get VERY  frustrated with the visual applications. They give you all the power of all the Git commands at your fingertips but when you don’t know what the commands do… well - it becomes a mess, trust me. So the best way to learn git is by using it through a console. That way - when you need something done, you'll look up the command you need, learn what it does and then use it. And that way you’ll know Git a little bit better.

Well, to be frank - there are couple of things that are common everyday work which is Commit, Push to server, Pull from server and Switch Branches. The GitHub For Windows app that I mentioned above does these three things very well so I use a combination of that GitHub app for the basic operation along with command line for all the advanced merging, rebasing, cherry-pick etc.

The Learning Curve

Talking about merging, rebasing, cherry-pick… Well GIT is not like SVN… Git is really simple but if you don’t ‘get’ it – it become really hard. There are a lot of commands and a lot of stuff you need to understand if you don’t want to bang your head against a wall. It’ll be a lot easier if there is a person in your team(or in your company) who could explain the theory behind DVCS and graph representation of commits and branches and what is Git doing behind the scenes with those hash keys on each commit… well there’s a lot of stuff. But it’s a lot simpler than what it sounds like.

If you’re want to learn more about Git there are a couple of materials that I found are very good.

But the one that opened my eyes the most is the video Git Happens by @Jessitron. Really recommend it if you need to ‘get’ Git.

Line Endings

Ohhh… the line endings. This is one of the most frustrating things in git – dealing with the different line endings for the files in your repository. The problem here is that Unix, Mac OS and Linux operating systems use LF for denoting the end of a line, while Windows uses CLRF. That results in very frustrating situations in Git where you haven’t changed a file but nevertheless git picks it up as a changed file and wants you to commit it.

To fix the problem with line endings there is a simple fix that you must do before you make a repository and that is to setup a .gitattributes file in your repository with the proper configuration.

Here’s the .gitattributes of one of my projects:

If you need more info on the subject you can check out Mind the end of your line and the GitHub article on the subject.

The paid version

I have to note also that there are a couple of companies that offer premium DVCS for the Enterprise and help you deal with all the challenges that I talk about in here, but they come at a price – $25-$30 for person/per month. That is not so much, considering the salaries and overall expenses that a company has for a single developer, but… there are different companies and for some it’s acceptable and for some it’s not. Maybe for more developers this price adds up or some companies just don’t want to have to depend on such a company for the core of the business – the source code. But it’s good to know that there is this kind of premium solutions on the market.

Conclusion

As a conclusion I have to say that yes – transferring to a DVCS is not an easy thing. It takes time and effort to get it right. But it’s TOTALY worth it. It gives you the freedom to work as you like on the features you’re developing – use branching and merging without any fear and not have to conform with the tools you use(and you don’t need to shout in the room – “Everyone – no more commits because I’m merging Open-mouthed smile”).

I’ve gotten used to the easiness of work that much, that actually cannot imagine working in an non-DVCS environment so I strongly recommend it to every company out there. And even if your company doesn't want to transfer to a DVCS source control – go ahead and make an account at GitHub or BitBucket for your personal project. It’s a good experience and you never know when you’re going to need it or how it’s going to change your life.

Tags:

Published at

Originally posted at