Principal component analysis

Let's look at the use of the word "thee" and visualize it as a dimension, or axis. Each of Shakespeare's works can be placed on that axis, like a data point, based on the number of occurrences of that word. In statistics, the spread of these points gives us what is known as the variance, a measure of how far the data stray from their average. But this is only a single characteristic in a very high-dimensional space. With a dimensionality-reduction technique called Principal Component Analysis, we can reduce the multidimensional space into a few principal components that collectively capture much of the variance in Shakespeare's works.
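To make the idea concrete, here is a minimal sketch in Python using NumPy. The word counts are invented purely for illustration (they are not real figures from Shakespeare's works), and the names `counts`, `words`, and `pca` are hypothetical. It first computes the variance along the single "thee" axis, then projects all the word-count dimensions onto two principal components.

```python
import numpy as np

# Hypothetical word counts: one row per work, one column per word.
# These numbers are made up for illustration, not real Shakespeare data.
words = ["thee", "thou", "love", "king"]
counts = np.array([
    [120, 200,  90,  15],
    [ 80, 150,  60, 140],
    [150, 230, 130,  10],
    [ 60, 110,  40, 160],
    [100, 180,  85,  70],
], dtype=float)


def pca(X, n_components):
    """Project the rows of X onto the top principal components via SVD."""
    # Center each column so every axis has zero mean.
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by decreasing variance.
    U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:n_components]
    # Sample variance explained by each direction.
    explained_variance = (S ** 2) / (X.shape[0] - 1)
    ratio = explained_variance / explained_variance.sum()
    # Coordinates of each work in the reduced space.
    scores = X_centered @ components.T
    return scores, components, ratio[:n_components]


# Variance along the single "thee" axis (sample variance).
thee_variance = counts[:, 0].var(ddof=1)

# Reduce the four word dimensions to two principal components.
scores, components, ratio = pca(counts, n_components=2)

print("variance of 'thee' counts:", thee_variance)
print("share of total variance per component:", ratio)
print("each work in the reduced space:\n", scores)
```

Each row of `scores` places one work in the reduced two-dimensional space, and `ratio` reports what share of the total variance each component accounts for. In practice the word counts would usually be normalized by the length of each work first; that step is left out here to keep the sketch short.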