To understand what the structure element means, just imagine what the MT is doing on a single pixel, any one of them.
So, say the algorithm is on pixel (i,j). It places the structure element "over" that pixel, centered on it, and creates a new array (a vector) to store the pixel values that will be used. Then it goes through every position of the structure element: wherever the element has a 1, the image value under that position is read and copied to the array; wherever it has a 0, the value is skipped. At the end, you have the collection of values, and you can perform the statistical operation you selected on it (minimum, maximum, etc.).
To summarize, the structure element defines the neighbourhood of every pixel.
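The per-pixel procedure described above can be sketched in plain Python. This is only an illustration, not the actual implementation of any tool: the names `apply_structure`, `image` and `cross` are made up for the example, and skipping positions that fall outside the image is an assumption, since nothing above says how borders are handled.

```python
def apply_structure(image, struct, stat):
    """Apply `stat` (e.g. min or max) over the neighbourhood that
    `struct` selects around every pixel. Hypothetical helper."""
    rows, cols = len(image), len(image[0])
    s_rows, s_cols = len(struct), len(struct[0])
    ci, cj = s_rows // 2, s_cols // 2  # centre of the structure element
    out = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            values = []  # the "vector" of pixel values to be used
            for a in range(s_rows):
                for b in range(s_cols):
                    if struct[a][b] != 1:
                        continue  # a 0 means: ignore this position
                    y, x = i + a - ci, j + b - cj
                    # Assumption: positions outside the image are skipped.
                    if 0 <= y < rows and 0 <= x < cols:
                        values.append(image[y][x])
            # If nothing was selected, keep the original value.
            out[i][j] = stat(values) if values else image[i][j]
    return out


image = [
    [1, 2, 3],
    [4, 9, 5],
    [6, 7, 8],
]
cross = [
    [0, 1, 0],
    [1, 1, 1],
    [0, 1, 0],
]

dilated = apply_structure(image, cross, max)  # [[4, 9, 5], [9, 9, 9], [7, 9, 8]]
eroded = apply_structure(image, cross, min)   # [[1, 1, 2], [1, 2, 3], [4, 6, 5]]
```

Passing `max` or `min` as `stat` is just a convenient way to show that the structure element only decides which values are collected; the statistic applied afterwards is a separate choice.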
0 1 0
1 1 1
0 1 0
This one, the 3x3 cross, tells the algorithm to look at the current pixel and the 4 pixels that are contiguous to it (above, below, left and right).
0 1 0
1 0 1
0 1 0
This one is almost the same, but the value of the current pixel is ignored.
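To see the difference between the two crosses, here is a small sketch that collects the values for a single pixel only. The helper `neighbourhood` and the sample `image` are hypothetical, and positions outside the image are simply skipped (an assumption).

```python
def neighbourhood(image, struct, i, j):
    """Return the image values selected by `struct` when it is
    centred on pixel (i, j). Hypothetical helper."""
    rows, cols = len(image), len(image[0])
    ci, cj = len(struct) // 2, len(struct[0]) // 2
    values = []
    for a, struct_row in enumerate(struct):
        for b, flag in enumerate(struct_row):
            y, x = i + a - ci, j + b - cj
            # Keep the value only under a 1 and only inside the image.
            if flag == 1 and 0 <= y < rows and 0 <= x < cols:
                values.append(image[y][x])
    return values


image = [
    [1, 2, 3],
    [4, 9, 5],
    [6, 7, 8],
]
cross = [[0, 1, 0], [1, 1, 1], [0, 1, 0]]
hollow_cross = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]

with_centre = neighbourhood(image, cross, 1, 1)            # [2, 4, 9, 5, 7]
without_centre = neighbourhood(image, hollow_cross, 1, 1)  # [2, 4, 5, 7]

print(max(with_centre))     # 9, the centre pixel itself wins
print(max(without_centre))  # 7, the centre value is never looked at
```

With the second structure, a bright isolated pixel cannot influence its own result, because only its 4 neighbours are read.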
I'll try to create a few examples of structures and different shapes tomorrow, and the results on simple images.