The overall goal is to build a "person of interest" (POI) identifier based on the Enron financial and email dataset, which was made public as a result of the scandal. Machine learning is useful for building an identifier of this type because it can detect patterns and trends in the data that might not be apparent from observation alone, and use them to predict which people might be involved in fraud. Since the Enron dataset already provides labels distinguishing POIs from non-POIs, a supervised learning approach is the appropriate choice.
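
To make the idea of supervised learning from POI labels concrete, here is a minimal sketch in plain Python. It is not the classifier used in this project: the nearest-centroid rule is chosen only because it is short, and the feature values, the `train` list, and the helper names `fit_centroids` and `predict` are all hypothetical and not taken from the Enron dataset.

```python
# Toy, made-up records: (feature_1, feature_2) -> label, where 1 = POI, 0 = non-POI.
# These numbers are illustrative assumptions, NOT values from the Enron dataset.
train = [
    ((1.0, 5.0), 1),
    ((1.2, 6.0), 1),
    ((0.3, 0.5), 0),
    ((0.4, 0.2), 0),
]


def fit_centroids(samples):
    """Training step: average the feature vectors of each labeled class."""
    sums, counts = {}, {}
    for features, label in samples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {
        label: tuple(v / counts[label] for v in acc)
        for label, acc in sums.items()
    }


def predict(centroids, features):
    """Prediction step: return the label of the closest class centroid."""
    def squared_distance(label):
        return sum((a - b) ** 2 for a, b in zip(features, centroids[label]))
    return min(centroids, key=squared_distance)


centroids = fit_centroids(train)
prediction = predict(centroids, (1.1, 5.5))  # an unseen, unlabeled record
```

The key point is that the labels drive the training step: the model learns what each class looks like from records whose POI status is already known, then assigns a label to records it has not seen.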