Showing posts with label Tools. Show all posts

Monday, October 31, 2011

Parsing CSS for analysis

This time we’ll look into parsing CSS so we can use the result for analysis.

The story

As we are starting a new project using ASP.NET MVC3, one of the things I was thinking about was how to better manage our CSS and JavaScript. The first thing I figured would be great was if you could do some static analysis on all of your CSS files to see if certain situations arise.

The issue with multiple people working on the same HTML based UI is that it can be hard to find out what CSS class to use. Because of that you might end up with classes doing the same thing, classes being defined more than once with different definitions, etc.

All this leads to a less than ideal situation. Removing the cause of the problem, not knowing what CSS class to use, proves difficult. Countering the symptoms of the problem is proving a lot easier, so that’s the path I chose for now.

A two part solution

In order to do static analysis on something that is only available to me in plain text, I first need to parse the plain text into something that is easier to analyze. Then I can use the result to do the actual analysis on and communicate the result.

Analyzing how to parse CSS

Writing a parser for any language requires a deep understanding of its syntax. Fortunately for me, CSS is not a very complex language. W3Schools.com proves to be very helpful in providing us with an explanation of the CSS syntax. It comes down to this:

  • A CSS document consists of CSS rules
  • A CSS rule consists of a Selector and a set of Declarations
  • Each declaration consists of a property and a value
  • Comments can exist at the top level of the CSS document (outside of the CSS rules) or in between declarations

In order to do static analysis on this I came up with some interface definitions that allow me to query the structure of a CSS document:

    public enum SelectorType
    {
        Tag,
        Id,
        Class
    }

    public interface ICSSDocument
    {
        string FilePath { get; set; }
        IEnumerable<IRule> Rules { get; }
        void AddRule(IRule rule);
    }

    public interface IRule
    {
        ISelector Selector { get; set; }
        IEnumerable<IDeclaration> Declarations { get; }
        void AddDeclaration(IDeclaration declaration);
    }

    public interface ISelector
    {
        string Name { get; set; }
        SelectorType SelectorType { get; set; }
    }

    public interface IDeclaration
    {
        string Name { get; set; }
        string Value { get; set; }
    }


Note that this might not be final as I might come up with requirements implementing the static analysis.
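
To make the interfaces a bit more tangible, here is a minimal sketch of what in-memory implementations could look like. The class names (Declaration, Selector, Rule, CssDocument) are assumptions for illustration and not necessarily the ones used in the actual project. The interfaces are repeated in compact form only so the sketch compiles on its own.

```csharp
using System.Collections.Generic;

// Interfaces repeated in compact form so this sketch compiles on its own.
public enum SelectorType { Tag, Id, Class }

public interface IDeclaration { string Name { get; set; } string Value { get; set; } }

public interface ISelector { string Name { get; set; } SelectorType SelectorType { get; set; } }

public interface IRule
{
    ISelector Selector { get; set; }
    IEnumerable<IDeclaration> Declarations { get; }
    void AddDeclaration(IDeclaration declaration);
}

public interface ICSSDocument
{
    string FilePath { get; set; }
    IEnumerable<IRule> Rules { get; }
    void AddRule(IRule rule);
}

// Hypothetical implementations; the class names are assumptions.
public class Declaration : IDeclaration
{
    public string Name { get; set; }
    public string Value { get; set; }
}

public class Selector : ISelector
{
    public string Name { get; set; }
    public SelectorType SelectorType { get; set; }
}

public class Rule : IRule
{
    // Kept private so callers can only add declarations through AddDeclaration.
    private readonly List<IDeclaration> _declarations = new List<IDeclaration>();

    public ISelector Selector { get; set; }
    public IEnumerable<IDeclaration> Declarations { get { return _declarations; } }
    public void AddDeclaration(IDeclaration declaration) { _declarations.Add(declaration); }
}

public class CssDocument : ICSSDocument
{
    private readonly List<IRule> _rules = new List<IRule>();

    public string FilePath { get; set; }
    public IEnumerable<IRule> Rules { get { return _rules; } }
    public void AddRule(IRule rule) { _rules.Add(rule); }
}
```

Exposing the collections as IEnumerable and adding items through a method keeps the model read-only for the analysis code, which only needs to query it.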



For parsing text like this there are always several approaches. In this case I decided that using plain and simple text parsing would be the most flexible, as I might want to add features to the parser in the future. Here is what I came up with for the main parse loop:




    // Requires: using System.IO;
    public void Parse()
    {
        string data = File.ReadAllText(FilePath);

        _position = 0;
        _isInComment = false;
        while (_position < data.Length)
        {
            // Stop one character early: the comment checks look ahead one position.
            if (IsEndOfFile(data))
            {
                break;
            }
            HandleBeginOfComment(data);
            HandleEndOfComment(data);
            if (!_isInComment)
            {
                HandleRule(data);
            }
            else
            {
                // Skip over the content of the comment.
                _position++;
            }
        }
    }


As you can see I have a (private) field to keep track of the position within the CSS document and another field to keep track of comments.



You might find the IsEndOfFile method weird, as I have a condition within the while loop that should do the same thing. However, I need to check ahead one position in case I’m still checking for comments (or am in a comment, for that matter). The definition of the method is quite simple:





    private bool IsEndOfFile(string data)
    {
        return _position == data.Length - 1;
    }



The HandleBeginOfComment method checks for the start of a comment:





    private void HandleBeginOfComment(string data)
    {
        // The bounds check guards the look-ahead when this is called at the last character.
        if (_position + 1 < data.Length
            && data[_position] == '/' && data[_position + 1] == '*')
        {
            _position += 2;
            _isInComment = true;
        }
    }



Basically it checks for the string /* and if it finds that string at the current position it moves the cursor by two characters and sets the _isInComment flag. HandleEndOfComment does the same thing for */ and sets the _isInComment flag to false again. Any comments are currently ignored, but it is easy to extend the main parse loop to allow for parsing comments as well.
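
Since HandleEndOfComment is only described and not shown, here is a sketch of how the pair of methods could work together. This is not the original source: the CommentScanner class and its StripComments driver are hypothetical and exist only to make the sketch self-contained, and the extra _isInComment and bounds checks are my own assumptions.

```csharp
using System.Text;

// Hypothetical wrapper class, used only to make the sketch self-contained.
public class CommentScanner
{
    private int _position;
    private bool _isInComment;

    private void HandleBeginOfComment(string data)
    {
        if (!_isInComment && _position + 1 < data.Length
            && data[_position] == '/' && data[_position + 1] == '*')
        {
            _position += 2;
            _isInComment = true;
        }
    }

    // Mirrors HandleBeginOfComment: looks for "*/" and clears the flag.
    private void HandleEndOfComment(string data)
    {
        if (_isInComment && _position + 1 < data.Length
            && data[_position] == '*' && data[_position + 1] == '/')
        {
            _position += 2;
            _isInComment = false;
        }
    }

    // Hypothetical driver: returns the input with all comments removed.
    public string StripComments(string data)
    {
        var result = new StringBuilder();
        _position = 0;
        _isInComment = false;
        while (_position < data.Length)
        {
            int before = _position;
            HandleBeginOfComment(data);
            HandleEndOfComment(data);
            if (_position != before)
            {
                // A comment delimiter was consumed; re-check at the new position.
                continue;
            }
            if (!_isInComment)
            {
                result.Append(data[_position]);
            }
            _position++;
        }
        return result.ToString();
    }
}
```

Calling new CommentScanner().StripComments("a/* x */b") returns "ab". Collecting the skipped characters instead of dropping them would be one way to extend this to parse comments as well.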



The HandleRule method takes care of all the parsing magic, which makes sense as the Rule is the main component of a CSS document.





    private void HandleRule(string data)
    {
        // Advance to the start of the next rule, bailing out if a comment starts first.
        while (_position < data.Length && !StartOfRule(data[_position]))
        {
            HandleBeginOfComment(data);
            if (_isInComment)
            {
                // Let the main loop find the end of the comment.
                return;
            }
            _position++;
        }
        if (_position >= data.Length)
        {
            // Nothing but trailing content was left, so there is no rule to parse.
            return;
        }
        string selectorData = GetSelector(data);
        string declarationsData = GetDeclarations(data);

        IRule rule = _kernel.Get<IRule>();

        ISelector selector = _kernel.Get<ISelector>();
        selector.Name = selectorData;
        selector.SelectorType = GetSelectorTypeFromName(selectorData);
        rule.Selector = selector;

        HandleDeclarations(rule, declarationsData);

        AddRule(rule);
    }



The first loop deals with running into a comment later on in the document. If we do run into a comment we simply return to the main loop, which will then deal with finding the end of the comment. In the same loop it checks for the start of a Rule.



If we are still in the method after this loop, we have reached the start of a Rule. As we’ve noted earlier a Rule consists of a Selector and a set of Declarations. The methods GetSelector and GetDeclarations take care of parsing those portions of the CSS document. Once we have that data we can use it to create a rule. We use a Ninject Kernel to create instances of both the IRule and ISelector implementations.



Note that right now we handle the Selector like it’s a single entity. A future improvement might be to split up the Selector into parts and assign types to them individually.
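
The GetSelectorTypeFromName method is not shown in this post. As a rough sketch of what such a method could look like when the selector is treated as a single entity, the version below only inspects the first character. The SelectorClassifier class is a hypothetical container, and the classification rule is an assumption, not the original implementation.

```csharp
using System;

public enum SelectorType { Tag, Id, Class }

// Hypothetical helper class; only the method name comes from the post.
public static class SelectorClassifier
{
    public static SelectorType GetSelectorTypeFromName(string selectorName)
    {
        string trimmed = (selectorName ?? string.Empty).Trim();

        // "#header" is an id selector, ".menu" is a class selector.
        if (trimmed.StartsWith("#", StringComparison.Ordinal))
        {
            return SelectorType.Id;
        }
        if (trimmed.StartsWith(".", StringComparison.Ordinal))
        {
            return SelectorType.Class;
        }
        // Everything else is treated as a tag selector.
        return SelectorType.Tag;
    }
}
```

Because only the first character is inspected, a compound selector such as "div.menu" is classified as Tag, which is exactly the limitation that splitting the selector into parts would address.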



The HandleDeclarations method takes the declarations text, parses it into IDeclaration implementations and adds them to the given IRule:





    private void HandleDeclarations(IRule rule, string declarationsData)
    {
        string[] declarations = declarationsData.Split(new char[] { ';' }, StringSplitOptions.RemoveEmptyEntries);
        foreach (string declaration in declarations)
        {
            if (string.IsNullOrWhiteSpace(declaration))
            {
                continue;
            }
            int splitterIndex = declaration.IndexOf(':');
            if (splitterIndex < 0)
            {
                // Malformed declaration without a colon; skip it instead of throwing.
                continue;
            }
            string declarationName = declaration.Substring(0, splitterIndex).Trim();
            string declarationValue = declaration.Substring(splitterIndex + 1).Trim();

            IDeclaration declarationInstance = _kernel.Get<IDeclaration>();
            declarationInstance.Name = declarationName;
            declarationInstance.Value = declarationValue;
            rule.AddDeclaration(declarationInstance);
        }
    }



Note that I use String.Trim to make sure we don’t end up with white space in our declaration data, which could get in the way of our analysis (and is of no value anyway in CSS).



So far so good. We can now parse CSS into an object model, which allows us to analyze the CSS in a structured way. I plan to write a follow-up post that shows the analysis based on this model.
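
To give an idea of the kind of check such a model makes possible, here is a small sketch that finds selectors defined more than once. It is only an illustration under my own assumptions: the SelectorAnalysis class and FindDuplicateSelectors method are hypothetical, and the method works on plain selector names so that it stands on its own.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical analysis helper; not part of the parser shown above.
public static class SelectorAnalysis
{
    // Returns every selector name that occurs more than once,
    // in the order in which each name first appears.
    public static IList<string> FindDuplicateSelectors(IEnumerable<string> selectorNames)
    {
        return selectorNames
            .GroupBy(name => name.Trim(), StringComparer.Ordinal)
            .Where(group => group.Count() > 1)
            .Select(group => group.Key)
            .ToList();
    }
}
```

With the object model from this post, the input could come from something like document.Rules.Select(r => r.Selector.Name). The comparison is case-sensitive here, which is a simplification.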



No complete source included: Unfortunately, as this project is owned by my employer, I cannot include the full source code. However, the parts I included should provide you with a good insight into parsing CSS.

Thursday, June 30, 2011

Code generation is a basic developer skill

“Give a good developer a 40 hour task and he’ll spend 39 hours writing a program that can do the task in 1 hour.” – unknown

Really? This seems risky!

It can look risky, can’t it? You might end up with a half working program and not enough time to do the job. There are approaches that reduce this risk, but more on that later.

Why should you even bother? Simple math tells us it’s not faster to do so (at least not in the above example). However, experience tells us that in general most jobs will be repeated. Now, with the above example, if it’s only repeated once, you have a 39 hour profit, as you can now do the job in 1 hour.

Obviously, things don’t work like that in the real world. Automating a process usually takes longer than actually manually doing the work. In fact, when you want to automate something, that usually means you first have to do it by hand at least once, so you know what you’re automating.

But it’s also unlikely that you will only repeat a process just once. Often tasks are done many times and then it becomes worth the effort of automation. Let’s look at a current example.

A common case

As some of you may know, I’m working on a line of business application with a Silverlight UI and a WCF service layer (although the technology used doesn’t really matter for this case). Part of writing most line of business applications is writing business objects that represent your model and writing logic to retrieve data from the database, either directly or through some ORM.

Another part is writing lots of “forms”, if you will, to input data into. Usually these forms follow a common pattern. Both of these parts, the business objects and the forms, contain a lot of repetitive work to do manually. Some of the code can be moved into generic super classes, but most of the code is of a nature that doesn’t allow for generalization. This is where automation comes into play. In this case, why write all that code yourself if you can also generate it?

When we first started out with this application, we did set out to build some code generation. Unfortunately most of us didn’t have a lot of experience with code generation as such, so wrong choices were made (C# generation with a StringBuilder is not the way to go and the same goes for Code DOM). After the initial versions of our code generation engine, we were forced to spend all our time on getting a first release out to our customers and no more work on automation in any form was a predictable result.

So, are you a good developer?

We all fall into this trap at some point. I know I have several times. You let a deadline pressure you too much, or you go back to your old naïve thinking, saying that this will just be some demo code and it won’t end up in production anyway. Whatever the reason, we all end up breaking our own rules from time to time.

Don’t get me wrong, that doesn’t make you a bad developer. In fact it means you are focusing on how to provide value to your customers, which is key for a good developer. Also, breaking these rules from time to time keeps us alert to why they are rules in the first place.

However, we should also take time every now and then to take a step back and have a look at what we are doing. In my case I found we should put a lot more effort in making our development process more efficient. This means making more code generic, refactoring existing generic code to fit our needs better, and generating more code.

Generating code, a daunting task?

To some developers generating code may seem like a daunting task. Don’t worry. That’s a good thing, because you won’t just dive in without giving it some thought. As you can see from our example, having a developer who thinks “Oh, this is easy. I’ll just take a StringBuilder and push out lots of .cs files” is far from ideal.

Actually the first step in code generation is not writing a code generation process. It all starts with meta data. If you don’t have something to base your code generation process on, then how are you going to generate code in the first place?

In case of a line of business application, often your data model can provide you with some meta data. In fact we started out with a stored procedure that would take meta data from our data model in SQL Server and put it in another database. Then we would generate source from that data.

Another source for meta data may actually come from people entering that data into an application. We use that too, to allow our functional specialists to provide us with information to generate the forms in the GUI.

Once you have your meta data in some form or the other, you can write your generation process. Actually, that part doesn’t have to be hard. In the end it’s just creating text from data, which most web developers have been doing for years. The main problem is integrating your process with your IDE. That is where T4 comes in.

As some of you may know T4 is already used in Visual Studio to generate source code. It also provides you with a simple button to trigger the execution of all T4 templates within your solution. This is obviously a lot easier than adding files generated by some external program.

As for editing these templates, Visual Studio doesn’t come with standard support for this (it’s treated as text by default, so no IntelliSense). If you are serious about T4, you should check out Tangible. They have an editor for T4, integrated in Visual Studio, both in a free edition as well as a pro edition with more features. Both editions come with IntelliSense and highlighting, but the free edition only supports limited namespaces and assemblies. You can find them here.

But what about those risks?

I hear you. Starting with this can be scary. The trick is to start small and then keep on growing your code generation process to support more and more scenarios. This way you can work on your code generation in iterations and have a working end result often. This reduces the risk, because you can, at any point, decide that you will no longer extend the code generation process, but extend the generated code by hand.

Also, before generating any code, you should have at least written and tested it once, to make sure you know the structure of your code and what meta data is needed in order to generate it. This way you also know you will end up with code that works.

Conclusion

I hope you realize the potential of code generation and how this can really make you more productive. I can tell you from experience that it also makes software development more fun, even if you have to build a lot of the same. With the right tools I’m confident that any development team could benefit from automating their work.

Monday, May 9, 2011

“A developer is only as good as his tools”

In this post I’d like to share some tools with you, that I think may be of use for any developer.

You may have come across the expression that is the title of this post. Some think this is not true and others swear by it. I’m sort of a middle of the road kind of guy when it comes to tools, as I do think that a tool is only useful when used effectively. However, our lives would be a lot harder without proper tools for development. For example, development becomes a lot easier with a highly tuned IDE like Visual Studio 2010, instead of typing code in notepad and compiling it directly with msbuild.exe.

As it turns out, I’m migrating to a new laptop for my work, so it seems like a good time to write about some of the tools I install on my new laptop.

As people tend to get very emotional on this subject:

DISCLAIMERS: This post is by no means sponsored in any way, except for the Google ads already in place. Also, most of these tools are NOT specific to developers. Some are useful for most PC users, some are useful for IT people and some are developer specific. This post is also by no means exhaustive. Finally, I will not talk about the obvious things, like Visual Studio, SQL Server, a virus scanner, etc.

LiberKey

One of the very first things I always install on a PC I use is LiberKey. This is a free tool to manage your tools. It comes with a massive (at the moment of writing this) 309 applications in 10 main categories, ranging from office tools to system utilities and anything in between. Applications like Notepad++, Paint.NET and HxD are all included.

The cool thing about LiberKey is that you don’t have to actually install anything to your hard disk. All the software just works, whenever you have LiberKey active, including full OS integration things, like explorer integration. It also deals with keeping all these applications up to date and new tools are added often.

Check it out at www.liberkey.com.

NuGet

If you haven’t heard about NuGet, you should check it out. This is basically an extension for Visual Studio 2010, that allows you to search for open source libraries and then immediately install them and add them to your Visual Studio projects.

What makes this so useful is that you no longer have to go and find a download source for these libraries, find out how to install them and then add them to your projects. You need a framework for dependency injection? Right-click your project file and click “Add Library Package Reference…”, type dependency injection in the search box and it will come up with open source libraries for that. Choose one, click install and you’re ready to go.

Check it out at nuget.codeplex.com.

Fiddler

If you’re going to do some work involving HTTP, this tool rocks. It allows you to follow HTTP and HTTPS traffic as long as the client runs on your machine. You can inspect messages going back and forward, look at time lines and even create your own messages.

Check it out at www.fiddler2.com.

XMind

I’m a fan of mind mapping. I use the technique for organizing my thoughts when working on complex problems and also for making notes in certain types of meetings. XMind is a great tool for making digital mind maps. It has great keyboard support, which makes me very productive without distracting from the thought process.

Check it out at www.xmind.net.

smtp4dev

As a developer I frequently need to write some code which sends out emails. As you may be aware, testing code like that against an actual SMTP server in production is a hazardous operation. Not only that, it also tends to clutter your inbox with test emails. In some environments it can actually present security issues as well, where virus scanners block unknown processes which send emails, in order to prevent malware attacks.

To actually have a way of testing code that sends emails, I use this great open source tool, called smtp4dev. It runs like a regular application, but it also functions as an SMTP server on your machine. It has a UI that allows you to easily configure and manage it. The UI also allows you to inspect any emails received by its SMTP server. Furthermore, it doesn’t relay any emails, so you can’t accidentally send out emails.
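
As a minimal sketch of how code under test could be pointed at such a local server, the example below builds a client and a message using the standard System.Net.Mail classes. The host name, the port number and the addresses are assumptions: they presume smtp4dev is listening on this machine on port 25, so adjust them to your own configuration. The LocalMailTest class is hypothetical.

```csharp
using System.Net.Mail;

// Hypothetical helper for sending test mail to a local SMTP server.
public static class LocalMailTest
{
    // Assumption: smtp4dev is listening on localhost, port 25.
    public static SmtpClient CreateLocalClient()
    {
        return new SmtpClient("localhost", 25);
    }

    // Builds a throwaway message; the addresses are placeholders.
    public static MailMessage BuildTestMessage()
    {
        return new MailMessage(
            "sender@example.com",
            "recipient@example.com",
            "Test subject",
            "Test body");
    }
}
```

Calling LocalMailTest.CreateLocalClient().Send(LocalMailTest.BuildTestMessage()) should then make the message show up in the smtp4dev UI instead of in anyone’s inbox, provided the server is running.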

Check it out at smtp4dev.codeplex.com.

The Regulator and Regulazy

Two great tools (by the same developer) for people who need to do work with regular expressions. Regulazy allows you to quickly create a regular expression from a piece of text through a UI. Although it is limited in what you can do, it does usually provide you with a good basis to extend on.

The Regulator is a more comprehensive tool, which allows for better analysis and testing of regular expressions.

Check both of them out at osherove.com/tools/.

LinqPad

If you work on a project that uses Linq and/or Entity Framework, LinqPad is the tool for you. This tool allows you to run Linq queries against your own Entity Framework model on the fly. This makes it a great tool for testing Linq queries, without actually having to compile and run your application every time. For the TDD evangelists out there who will tell you, you should never have to run your application to test such a thing, not having to compile can be a great advantage as well.

Check it out at www.linqpad.net.

Conclusion

Those are just some of the tools I use. Do you have any other tools you think people should at least know about? Leave a comment.