Hi,
I was reading this documentation (http://sampleclean.org/guide/) and I see that you can use any similarity metric to compare two strings in a single column attribute. Is it possible to use several similarity metrics together when comparing two strings, rather than just one? If so, how do you specify the additional metrics?
Also, what is the matrix that is fed into SVM and RandomForest? What are the columns of this matrix? Are the values the scores from different string metrics?
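To make the second question concrete, here is a rough sketch of what I imagine the matrix looks like: one row per candidate pair of strings and one column per similarity metric. This is plain Python and not the SampleClean API. The function names (`jaccard`, `edit_similarity`, `feature_matrix`) and the choice of these two metrics are my own assumptions for illustration only.

```python
from typing import Callable, List, Tuple


def jaccard(a: str, b: str) -> float:
    """Jaccard similarity over lowercase whitespace-separated tokens."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 1.0  # two empty strings are treated as identical
    return len(ta & tb) / len(ta | tb)


def edit_similarity(a: str, b: str) -> float:
    """1 - (Levenshtein distance / longer length); 1.0 means identical."""
    if not a and not b:
        return 1.0
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return 1.0 - prev[-1] / max(len(a), len(b))


# Hypothetical list of metrics; each one becomes a column of the matrix.
METRICS: List[Callable[[str, str], float]] = [jaccard, edit_similarity]


def feature_matrix(pairs: List[Tuple[str, str]]) -> List[List[float]]:
    """Return one row per string pair and one column per metric."""
    return [[metric(a, b) for metric in METRICS] for a, b in pairs]


if __name__ == "__main__":
    pairs = [("new york", "new york city"), ("kitten", "sitting")]
    for pair, row in zip(pairs, feature_matrix(pairs)):
        print(pair, row)
```

If this is roughly the layout, then each row (plus a match/non-match label) would be one training example for the classifier. Is that the right picture, or are the columns something else?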