In data analysis, making sense of the relationships hidden in a table can feel like navigating a labyrinth. This is where Dr Gianmarco Alberti's newly released 'caplot' R package comes to the rescue. It is designed to streamline the analysis of many kinds of tabular data, from crime statistics to textual analysis and beyond, and to simplify the interpretation of the results through a geometric perspective on Correspondence Analysis. Two examples illustrate what 'caplot' can do.
Figure 1
First, consider five speeches, each tabulated against its ten most frequently occurring words (Fig. 1).
Suppose you want to know which words are most strongly associated with each speech. 'Caplot' lets you run a Correspondence Analysis, select a reference word, and generate an informative scatterplot. For instance, the word 'government' (Fig. 2) is more strongly associated with speeches 1 and 2, because their projections fall closer to the 'government' point and farther from the plot's origin. Speeches 3, 4, and 5, on the other hand, are more strongly associated with the word 'freedom' (Fig. 3).
Figure 2
Figure 3
In another scenario, imagine a contingency table of crime rates across ten U.S. states (Fig. 4), taken from Borg, I., & Groenen, P. J. F. (2005), Modern Multidimensional Scaling: Theory and Applications, Springer.
Figure 4
You want to find out which crimes are the most and the least frequent in Massachusetts (MA). 'Caplot' lets you run a Correspondence Analysis, select Massachusetts as your reference category, and get an insightful scatterplot (Fig. 5).
Figure 5
In the resulting plot, crimes such as auto theft and robbery project nearer to the Massachusetts point, indicating a higher-than-average occurrence in that state. In contrast, crimes such as murder, rape, and assault have a progressively lower frequency, as their projections lie farther away from it.
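The 'higher than average' reading can also be checked directly on a table, since Correspondence Analysis displays how each row's profile deviates from the average profile. What follows is a minimal Python sketch of that comparison, not part of 'caplot' and not its API. The counts are invented for illustration (they are not the Borg and Groenen figures), and `profile_ratios` is a hypothetical helper name. A ratio above 1 means the crime takes a larger share in the reference state than it does on average.

```python
# Hypothetical counts, invented for illustration only.
# Rows are states, columns are crime categories.
crimes = ["murder", "assault", "robbery", "auto theft"]
table = {
    "MA": [3, 20, 40, 90],
    "AL": [14, 60, 20, 30],
    "TX": [13, 50, 40, 50],
}


def profile_ratios(table, columns, reference):
    """Return {column: reference-row share / average share}, sorted high to low."""
    grand_total = sum(sum(row) for row in table.values())
    # Average profile: each column's share of the grand total.
    col_totals = [sum(row[j] for row in table.values()) for j in range(len(columns))]
    average = [total / grand_total for total in col_totals]
    # Row profile: each column's share within the reference row.
    ref_row = table[reference]
    ref_total = sum(ref_row)
    ratios = {
        name: (ref_row[j] / ref_total) / average[j]
        for j, name in enumerate(columns)
    }
    return dict(sorted(ratios.items(), key=lambda item: item[1], reverse=True))


ratios = profile_ratios(table, crimes, "MA")
for crime, ratio in ratios.items():
    label = "above" if ratio > 1 else "below"
    print(f"{crime:10s} {ratio:.2f} ({label} average)")
```

With these made-up numbers, auto theft and robbery come out above average for MA, while assault and murder come out below it. That is the kind of ordering the scatterplot conveys visually, with the plot being a low-dimensional approximation of these deviations.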
By offering this unique geometric take on Correspondence Analysis, 'caplot' effectively translates multi-dimensional data into a readily comprehensible format. It aids in revealing hidden patterns that are otherwise elusive in raw tables. And the best part? 'Caplot' is freely available for anyone to explore on CRAN, the official repository of R packages.