Affinity Propagation is a clustering algorithm used in data mining and machine learning, known for identifying exemplars among data points to form clusters. Here's an overview of its workings and key features:
- Message Passing: The algorithm exchanges two types of messages between data points - "responsibility" and "availability." Responsibility messages assess a data point's suitability to be an exemplar for another point, while availability messages indicate the appropriateness for a point to choose another as its exemplar.
- Choosing Exemplars: Affinity Propagation does not require pre-determining the number of clusters. It identifies exemplars within the data, which are representative members of the input set around which clusters are formed.
- Similarity Measure: It requires a similarity matrix as input, quantifying the similarity between pairs of data points to evaluate how well each point serves as an exemplar for others.
- Iterative Process: The algorithm iteratively updates responsibility and availability messages until it identifies a set of exemplars and their corresponding clusters.
The algorithm is applied in fields like computer vision, bioinformatics, and information retrieval. It's effective in identifying clusters of various sizes, shapes, and densities without a predefined number of clusters. However, its computational intensity, particularly for large datasets, is a notable drawback due to the need for maintaining a similarity matrix for all data point pairs.