A Hierarchical Tree is normally made after computing a distance matrix between all the leaves of the tree (btw this is done for all the levels of the tree, one distance matrix per level.) In your case that would mean to compute the distance matrix between all your columns considered as points being represented in a 250.000 dimensional space.

Please have a look at the definition of what is the "Curse of Dimensionality" at the following link (Wikipedia: http://en.wikipedia.org/wiki/Curse_of_dimensionality). In essence, what the Curse of dimensionality says is that it doesn't make sense to compute distances in a very high dimensional space (which is your case.) The Tree you will get won't make much sense.

If you want to implement Aaron's suggestion (which for me is perfectly right), you will need first to reduce the dimensionality of your columns to a reasonable size while preserving the underlying relationship between columns. My suggestion is to do a K-means clustering of your data (with for instance K=1000) before transposing the table. Then you need to convert the k-means vectors into a table (of 1000 rows x # columns), transpose this table (# columns x 1000 rows) and eventually achieve the Hierarchical Clustering on the transposed table of K-mean vectors. If clustering your 250000 data into 1000 clusters by k-means makes sense, then you should get a pretty nice tree representing the hierarchical organization of your initial columns.