I started with TF-IDF/LSA + KMeans baselines, moved to POS-filtered representations, and then switched to BERT embeddings plus tag features. On the clustering side, I experimented with KMeans, spectral ...
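
For reference, the first baseline mentioned above (TF-IDF followed by KMeans) can be sketched roughly as follows. This is a minimal, dependency-free illustration, not the code actually used: the LSA step is left out, the function names `tfidf_vectors`, `sq_dist` and `kmeans` are hypothetical, the toy corpus is invented, and the deterministic farthest-first initialisation is a simplification chosen so the example is reproducible.

```python
import math
from collections import Counter, defaultdict


def tfidf_vectors(docs):
    """Return L2-normalised TF-IDF vectors (sparse dicts) for whitespace-tokenised docs."""
    tokenised = [doc.lower().split() for doc in docs]
    n_docs = len(tokenised)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for tokens in tokenised for term in set(tokens))
    vectors = []
    for tokens in tokenised:
        counts = Counter(tokens)
        # Smoothed IDF, so a term found in every document keeps a non-zero weight.
        vec = {
            term: tf * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0)
            for term, tf in counts.items()
        }
        norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
        vectors.append({term: w / norm for term, w in vec.items()})
    return vectors


def sq_dist(a, b):
    """Squared Euclidean distance between two sparse vectors."""
    return sum((a.get(t, 0.0) - b.get(t, 0.0)) ** 2 for t in set(a) | set(b))


def kmeans(vectors, k, n_iter=20):
    """Plain Lloyd-style KMeans on sparse dict vectors; returns one label per vector."""
    # Assumption: farthest-first initialisation, used here only for determinism.
    centroids = [dict(vectors[0])]
    while len(centroids) < k:
        farthest = max(vectors, key=lambda v: min(sq_dist(v, c) for c in centroids))
        centroids.append(dict(farthest))

    labels = None
    for _ in range(n_iter):
        # Assignment step: each vector goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: sq_dist(v, centroids[j])) for v in vectors
        ]
        if new_labels == labels:
            break  # converged
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:
                mean = defaultdict(float)
                for v in members:
                    for term, w in v.items():
                        mean[term] += w / len(members)
                centroids[j] = dict(mean)
    return labels


# Invented toy corpus with two clearly separated topics.
docs = [
    "cats purr and cats sleep",
    "kittens and cats purr",
    "stocks fell as markets closed",
    "markets rallied while stocks rose",
]
vectors = tfidf_vectors(docs)
labels = kmeans(vectors, k=2)
print(labels)  # [0, 0, 1, 1]
```

On real data one would more likely reach for a library implementation and add a dimensionality-reduction step (the LSA part of the baseline) between vectorisation and clustering; the sketch only shows how the two stages fit together.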