sábado, 19 de dezembro de 2015

The VBA Modules Workbook - what a deal!

Today I decided that I want to share all the general purpose modules I have created so far all at once. And to simplify even more, I decided to share a workbook  that has all the modules.

Here is the link to download the workbook:

https://www.dropbox.com/s/pxxwgap8hl7m9xw/VBA_MODULES_43.xlsm?raw=1

I suggest you to download it, and check the module's lines of code. Most of them have comments (I intend to comment all the code in the following versions, but since this might take some time, I would rather share everything now).

What can be done with those modules one might ask:

1. Perform queries and commands in SQL DBMS such as SQL Server, Oracle Sever, MySQL, Sharepoint 2013 and Access using very few lines of code and with standardized input and output of data)

2. Send outlook e-mails as easily as it can be.

3. Handle 2-dimensional arrays in a very robust manner (something that VBA is poor at doing - Read the post "Explaining VBA Arrays (and its issues)" to see why).

4. Handling files more easily (opening, reading, writing etc.)

5. Generating log as easily as it can be done in VBA.

6. Handling data from the web and so on.

Although these modules are not something that will cause any revolution in the programmer's life, I'm sure it has many useful things for the daily activities since it simplifies basic stuff such as connecting to databases and handling arrays.

I intend to explain each module in detail in other posts. Until then, please be free to use everything.

domingo, 15 de novembro de 2015

Explaining the ACCESS_CLASS - And how to easily perform SQL Queries in VBA

Today I want to talk about SQL queries. Database management systems (DBMSs) are present in pretty much any company of the globe and it is very important to be able to fetch data from them (an to write on them as well) as easily as it can possible be.

Many people don't know how to get data from a DBMS such as the Access application using VBA. The reason for that is very simple: it is complicated to do it using VBA. You have to know the connection string, how to use the connection object, the record set object and to manipulate the output only to perform a simple query.

All this complication is reduced by using the ACCESS_CLASS which I have developed. It contains basic methods to connect to the database and executing commands on an access database file.

I have created an example workbook that gets data from an Access database. To use it, you need to download the Access database file (world.accdb) and copy it to the C:\TEMP folder and to download the ACCESS_EXAMPLE.xlsm workbook, that contains the example routine, and press the 2 buttons you will see when you open it. The link for both the files is displayed below. This example works in windows 2007+, Excel 2010+ and Access 2013 (you must be sure those are the programs you are using).

world.accdb file link:

https://www.dropbox.com/s/2vn10qa7p81nnqv/world.accdb?raw=1

ACCESS_EXAMPLE.xlsm file link:

https://www.dropbox.com/s/yqw5a7hydi6sgxz/ACCESS_EXAMPLE.xlsm?raw=1

The example routines (in the "MAIN" module) have many more lines of code than it is actually necessary to perform the task. The reason for that is that I commented the code using the log file, this is, the log is my substitute for the comments. I also have used many of the best-practices I talk so much in this blog to make the code as professional as it can be. So please don't be scared by the number of lines of code. Try to read them at least once and you will understand how easy is to perform SQL queries in Access files using the ACCESS_CLASS.

Another interesting point is that most of the methods present in the ACCESS_CLASS are also present in other classes to connect to other kinds of DBMSs such as Sharepoint 2013, MySQL, SQL Server and so on. So once you are familiarized with the methods (connect, execute, transaction etc.) you can use the same knowledge in other classes. The other classes differ basically only by their attributes. For instance: the access database requires only the file path to be set, while a SQL Server requires the driver name, the hostname, the user, the password and the database name. The Sharepoint, on the other hand, requires the url of the site and the GUID of the list to be passed as attribute.


domingo, 25 de outubro de 2015

Excel Tables - and how useful they are

Today we will talk about a topic that is useful for both VBA and Excel. The tables are one of the most useful Excel features and, we have noticed, are sometimes left aside by Excel users. Believing that the reason for that is the lack of real understanding of the potential of the tables, specially regarding scalability, we are writing this article to show their importance and how to use them.

Table is an Excel object that provide more tools to work on Excel ranges. The typical layout of an Excel table is shown in the Image 01 below.


Name Age Company
John 30 ABC
Peter 40 XYZ
Mary 24 XYZ
Frank 56 XYZ
Mark 34 ABC
Image 01
When a range is transformed into a table, it is automatically formated so it becomes similar to the Image 01. This formatting can be easily changed, but the layout in Image 01 is the most common.

Once a table like in Image 01 is created, some very useful tools become available. You can easily sort the table, filter it, show totals at the bottom of it (total count/total sum etc.). All this and other features are explained in many topics that are spread around the web, but those are not the foccus of this article. The foccus of this article is scalability.

Let's suppose we have the very same data of Image 01, but in the form of ranges. See Image 02

Name Age Company
John 30 ABC
Peter 40 XYZ
Mary 24 XYZ
Frank 56 XYZ
Mark 34 ABC
Paul 56 ABC
Image 02
And lets supose we have a code to add up the ages (we know this is something that could be easily done by the =SUM() function, but this is just an example). Let's see how the code could be:Image 02

Option Explicit

Public Sub Sum_Ages()
    
    Dim i As Long
    Dim aux As Double
    
    For i = 2 To 6
        aux = aux + int(Range("B" & i).Value2)
    Next i
    MsgBox aux
    
    Exit Sub
    
End Sub

This is not a really terrible code. It does work and it does solve the problem. There are many other ways to solve this problem, but this code is as good (or as bad) as any other that do not use tables.
If the reader has not seen the issue yet, let's imagine that a new row is created. Let's say that the ranges is now like in the image 03 below.


Name Age Company
John 30 ABC
Peter 40 XYZ
Mary 24 XYZ
Frank 56 XYZ
Mark 34 ABC
Paul 56 ABC
Image 03

As it can be seen, a new row, was added. In that case, would the code above still give us the answer we want? The answer is NO. The code adds only the ages from row 2 to 6, so the new line, the 7th, would be left out. To solve this one would have to rewrite the code to get it - something we are keen to avoid. So what would be the solution for this problem? The answer: tables.

Let's now convert this very range into a table and name it "tbl_employee" (to name the table, after you convert the range, you should go to "Design" and then "Table Name" and put the name you want).

Now we's have a table like in the image 04.

Name Age Company
John 30 ABC
Peter 40 XYZ
Mary 24 XYZ
Frank 56 XYZ
Mark 34 ABC
Paul 56 ABC
Image 04

Again, the name of the table is "tbl_employee". With that in mind, let's write a new code that does the same of the previous one.

Option Explicit

Public Sub Sum_Ages2()
    
    Dim myTable As Range
    Dim m As Long
    Dim i As Long
    Dim aux As Double
    
    Set myTable = Range("tbl_employee")
    m = myTable.Rows.Count
    
    For i = 1 To m
        aux = aux + int(myTable(i, 2).Value2)
    Next i
    MsgBox aux
    
    Exit Sub
    
End Sub

The difference between this code and the previous one is that now we do not fix the number "6" as the number of rows. Instead, we set the range object as the table we have just created and ask for the number of rows of it. So for as many rows as the table has, we are certain to get the right value. We have just made our table scalable vertically. The same worths, though, for horizontal scalability, since a new column could be easily captured by the ".Columns.Count" method of the range object.

As you can see, all we had to do was to convert the range into a table, name it and then use it in our code as a normal range. The outcome, though, was very different from not using a table, since we couldn't get the updated number of rows or columns that easily.

It is important to notice that the tables automatically understand new rows and new columns. A new value right below the last row is immediately understood as a new row. This makes adding new content to the tables extremely easy and, once again, all the objects related to the table are also automatically updated.

Now, again, this is a very VBA related subject. But using tables has many benefits that go beyond VBA. Once the range is converted into table, any object that references it will also become scalable by definition, since the references maps not the cells addresses (which are absolute and fixed) but the table's relative addresses, which are not fixed and are dynamic. Graphics, formulas and VBA are made scalable when using tables.

Bottom line: there is no reason for not to use tables instead of regular ranges. It is a powerful tool Excel provides and it should be used to make our projects more robust and easy to develop.






















domingo, 18 de outubro de 2015

Excel & VBA Best Practices

Here is our list of Excel and VBA best practices. This is only a suggestion, not a definite set of rules. We believe using these tips can make Excel and VBA much easier.


Use the "Option Explicit" clause on the top of every module. This has been explained in another post and makes the programmer debugging much easier

Don't use the type of data as the prefix of a variable name (hungarian notation). This also has been said in this blog. Once you use the Option Explicit clause and have declared your variables, there is no need to use hungarian notation.

Handle Errors. Using the "On Error GoTo error_label" clause is not only useful, but fundamental. It makes sure your code will not have unexpected behavior. The "On Erro Resume Next" should be avoided though.

Raise Errors. Once the exception happened and the error handling code is executed, if the function is not the main sub, the error must be raised, otherwise the parent function will not know about the error and this may affect the rest of the execution.

Log. Having a log module is crucial. We provided the LOG_MODULE that is perfect for the task and logging should be made with no restriction (performance issues are easily addressed by turning the log off).

Be explicit in every possible way when programming. Another subject extensively explained in this blog, being explicit is very important for the sake of the quality of the code. Using full addresses of ranges, declaring if the routines are public or private and using the "ByVal" clauses are just some examples of explicit information that should be in your code.

Program with simple organization and indentation.

Avoid data manipulation in cells - use arrays whenever is possible.This also has been explained in this blog. Data manipulation in cells is slow and should be avoided.

Use modules. Modules are great to organize your code and to make it portable. Actually this is the core of this blog.

Use the "Private" and "Public" clauses. If the function or sub will be used only in its module, it should be private, otherwise it will be listed outside its module for no reason.

Be aware of the performance. VBA programs are single-threaded. This means you can not make anything in parallel. So keep in mind that small improvements in program performance add up and make your programs much faster.

Log the error description. Everytime an exception occurs and is handled, the description of the exception should be printed to the log file so this information is not lost. This is particularly useful for connections with backends. Err.Description is the attribute to be used.

Use full addresses of objects. It is important to uset, for example, "Thisworkbook.Worksheets("Sheet1").Range("A1")" instead of simply "Range("A1")". Implicit information is one of the most problematic things in VBA.

Use "Long" data type instead of "Integer"

Use "vbNullString" instead of "". Not only because it is more elegant, but more efficient in some cases.

Give every workbook object that you use a name. This means that every worksheet that you use, every table and every cell that needs to be refereced in VBA should receive an unique and descriptive name. This way it will be much easier to reference those objects in VBA instead of using, for example, the "A1" or "E3" address references - which can can change when we add or remove rows and columns.

Use hungarian notation with workbook objects names (objects that are not VBA variables). In the other direction of VBA variables, the objects spreaded around the workbook should be referenced with names that contain a tip about its nature. For example, a worksheet could be named "wks_main" or "wks_something" where "wks" is the prefix that categorize it as a worksheet. A table could be "tbl_something", a cell "cell_something" and so on. This, believe us, can make programming much easier .

Use tables. For sure one of the most useful Excel object, the tables as also known as dynamic ranges. They should be used everytime you have a list of related objects - a table! Once you have transformed the range into table (and, of course, named it), any update on it is reflected immediately on related objects such as pivot tables and graphics. Not only that, adding, for example, a new row changes the table size automatically, so you will always get the correct size of the table in VBA using the ".rows.count" range method. This is one of the best tips we have to give.

Exception Handling in VBA

This post is about something often ignored by VBA programmers. Many people think that handling exceptions in VBA is simply useless. It is much better, they say, to see the error, open the VBA Editor and go fix it right away.

For those we ask: would you do the same if it was a Java application? What about a C++ application? 

Of course one of the main advantages of VBA is the fact that the programming environment is right there next to the application and ready to receive modifications on the fly. But our purpose here is to create programs that can run as stable as a python application, with no need of any kind of manual intervention on the code whatsoever.

One of the best ways to avoid your program to bizarrely crash is to use exception handling.

We don't want to spend time explaining what exception handling is because there are many other source on the internet that can explain it better than we. But for those who have no idea what exception handling is, let us try to explain it in our on way:

Exception Handling is the programmers response to an unexpected failure event of the system. In a more practical way: it is a part of code that is executed when another part of the code fails by any reason.

VBA has only one way that we know of dealing with exceptions: the On Error clause. Let us give an example of how it is used:

''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
Option Explicit

Public Sub Exception_Handling_Example()
    
    Dim x as Double 

   On Error GoTo label_01
        x = 1000 / 0
        Exit sub

label_01:
   Exit Sub 
End sub 
''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''

The code below, when compiled, will present no errors. But when executed, it should generate an exception because we are trying to divide a number by zero. But since we've put the "On Error GoTo" clause, instead of halting the program and showing an error message, the program will simply exit.

The process that was developed for this program to end without halting is the Exception Handling. The On Error GoTo label_01 clause tells the program to jump to the label_01 part of the code in the occurrence of an exception.
All the lines of code below the "On Error" clause will be affected by it until another "On Error" clause is used. This means that one "On Error" clause invalidates the previous one so you can use different labels for exceptions  different parts of the code.

Now let us show you our suggestion of what a program with an Exception Handling should be.

''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
Option Explicit

Public Sub MAIN_Exception_Handling_Example2()

    Dim function_name as String
    function_name =  "MAIN.MAIN_Exception_Handling_Example2"
    LOG_MODULE.Log function_name
    
    Dim x, y as Double 

   On Error GoTo error_1000
        x = 1000
   
   On Error Goto error_2000
         
        y = 1000 / 0
       
        LOG_MODULE.Log function_name, 0
        Exit sub

error_1000:
    LOG_MODULE.Log function_name, Err.Description
    LOG_MODULE.Log function_name, "The x caused a problem"
    LOG_MODULE.Log function_name, 1 
    err.Raise(1) 
    Exit Sub 

error_2000:
    LOG_MODULE.Log function_name, Err.Description
    LOG_MODULE.Log function_name, "The y caused a problem"
    LOG_MODULE.Log function_name, 1
    err.Raise(1)
    Exit Sub

End sub 
''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''

The program above has elements that we use extensively in all of our codes. This is really the basic layout of pretty much all of the functions we created and that are in our modules. Let us try to explain what is going on.

The first thing one must notice is that this program generates log. This means that the LOG_MODULE must be imported (the LOG_MODULE is available for download in this blog).

The first thing the function does is to log its name. To do so we create an variable, "function_name" and give it the name of the function module plus the function name - the function is in the MAIN module. After that we call the "LOG_MODULE.Log" function and pass the name of the function as a parameter. This prints the parameter - the function name - to the log file, which is generated by the LOG_MODULE.Log function. This is the explanation of the 3 lines below.

    Dim function_name as String
    function_name =  "MAIN.MAIN_Exception_Handling_Example2"
    LOG_MODULE.Log function_name

After that, we declare 2 variables, x and y as doubles. After that we start our 1st error handling scope: we use the "On Error GoTo error_1000" clause which means that if any part of the code from that point on generates an exception, the program must jump to the label "error_1000". This is true until the second error handling scope, which starts with the "On Error GoTo error_2000" clause, which means that if any part of the code from that point on generates an exception, the program must jump to the label "error_2000".

This gives our function 2 possible outcomes in the presence of exceptions: the code below the error_1000 label and the code below the error_2000.

One important thing to notice is that this program WILL halt in the event of an exception. This is because we put the "Err.Raise(1)" method in the end of each error handling. This method "raises" the exception so that if the function is being called by another function, the parent function will get an exception. Had we not put this clauses, the function would never halt, it would just log the err.Description message, log the "ERROR" message and exit the function.

This might be a little confusing for those who are not used to handle exception. But think this way: sometimes we can not predict what will happen to our program. For example, we can ask for a database connection and the database management system might be out, so we would get an connection error - which would cause an exception. In that case, we might not want the program to enter debug mode and show an error message, but rather show us, say, a friendly message or even to try to connect again without showing any messages. We would also like to see what happened written in a log file, so we wouldn't have to go into the source code and execute the program step by step to debug it.

That is exactly what this way of handling exception does. It gives the tools for the programmer to decide what to do in the event of a exception. And we strongly recommend the logging, so that the exception messages are stored in he log file so that the programmer can understand what happened.

Keep in mind that all the modules we will publish have this structure. They all generate log and they all handle exceptions. This makes them as robust as, say, Java or Python libraries (of course, giving VBA's limitations) and this strucuture is ideal to create real VBA application that are designed to work like any other real application.



Explaining the "Option Explicit"

Many of you may have noticed the "Option Explicit" code written on the beginning of our programs. Some of you may know its purpose, many of you don't.

The Option Explicit is a "compiler directive" which is just a fancy name to say that this has nothing to do with the program itself (it will not become part of the program binary code) but rather will be only used during the time of compiling (the time where code will be transformed into machine code - binary numbers).

To understand why we use this directive, read the code below

public sub awarenessTest()
    
  auxiliar = 30
  auxiliar = auxiliar - 15
  auxiliar = auxiliar + 20
  auxiliar = auxiliar +8
  auxiliar = auxiliar - 3
  msgbox cstr(auxilliar)
  exit sub

end sub

Now if we asked the reader to tell me what would be the value that would appear in the message box by the end of the execution of this program, unless you skipped 2nd grade classes, I'm quite sure your answer would be "40". But that is not the correct answer. The answer is "0".

The reason why the answer is "0" is because the message box does not print the "auxiliar" variable; it, instead, prints the "auxilliar" variable (with 2 letters "L"). This is a common "typo" mistake, hard to find in the code and that causes terrible crashes in programs. This kind of typo is so common that it, alone, justifies the use of the "Option Explicit".

The "Option Explicit" tells the compiler to only accept variables that are previously declared by the programmer (using the "Dim" word). So although declaring variables may seem like a wast of time, not declaring them (together with not using the "Option Explicit") may cause us a much greater wast of time debugging our code. If, in our example, we had used the "Option Explicit" and had declared the auxiliar variable, the message box line would make the compiler prompt an error message, since the variable "auxilliar" is not defined, making our mistake clear.

Another reason for using the "Option Explicit" is making the use of variable prefix unnecessary. Let us explain: some programmers like to put the type of the variable as a prefix of the variable name. For example, if we had a variable x as an integer, its name would be, say, "intx" or "int_x". Let us show a few examples of how this is done:

Dim dbl_age as Double
Dim obj_mail as Object
dim intX as Integer

This nomenclature - which even has a name: "hungarian notation" - they say, makes the program easier to be read. We strongly disagree.

Once you have declared your variable and told its type, there is no need to put its type again on its name. It is just... redundant. So if you use the "Option Explicit", you are obliged to declare it and therefore don't have to use hungarian notation.

We strongly recommend you to always use the "Option Explicit" code on the top of every module.


Why you should be explicit when coding

This is one of those subjects that people who suffer with VBA would ignore, since it seems like just a best practice (and who need those, right?), but that makes all the difference.

Being explicit is writing

''''''''''''''''''''''''''''''''''
dim x as String
x = Thisworkbook.Worksheets("Sheet1").Range("A1").value2 
'''''''''''''''''''''''''''''''''''
instead of  
'''''''''''''''''''''''''''''''''''
x = Range("A1")
''''''''''''''''''''''''''''''''''
The idea here is that while the 1st  gives an exact and undoubtedly address for the cell, the 2nd makes assumptions: it assumes that the correct workbook and worksheet are active.Although the programmer might had made this true in the previous lines by activating the workbook and worksheet, the odds are the he or she didn't.

What usually happens is that we write a code today and we are certain that, for example, there will be only one workbook opened or that that worksheet will always be activated, so there is no need to specify the full address of the range. But, in a point of the future, we change the code without remembering this assumptions and the program crashes with no apparent reason - and this kind of frustration is one of the main causes people hate VBA.  

The ".value2", which is related to the content of the cell, is also a form to avoid unpleasant surprises like getting a formatted value of a cell instead of its content. It also makes the code clearer : "x",  in the example, is not a range object, but a string so, although the compiler is smart enough to understand that when assigning the range to x you actually meant its value, you shouldn't do that. This is one of the most important things we have to teach here: do not let the compiler make assumptions for you. This WILL cause problems in the future. It is important to always be 100% straight about what you are doing. So please do yourself a favor and use the full address of objects and make explicit that you want values instead of the object itself.

Another thing we would like to point it out can be seen in the two functions below:

'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
function f1(x)
    y = x*2
    f1 = y
end function
''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''' 
Option Explicit

public function f2(ByVal x as Double) as Double
   
    dim y as Double

    y = x * 2

    f2 = y   

    exit function

end function
'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''

Although both functions would work perfectly, it is clear that the f2 function was written with much more information. We know, by reading the code, that the function is public (which means it can be accessed from other modules), that x and y are doubles, that x is passed by value (which means that x has the same value of the argument passed when the function is called, but it is not the argument itself) and that the function returns a double value. All this information is implied in f1, which makes much harder to understand what the programmer really wanted - we mean: did the programmer actually wanted the f1 function to be public or did he simply forget to declare it as private?

Being explicit is one of the best ways to avoid small mistakes when programming in VBA. This goes in line with our statement in the MAIN post (the 1st post of this blog) which says that we do believe a 10% overhead to implement best practices in our code is rewarding. The reward is fewer errors and an easier to maintain code.