Introduction
The availability of extensive geolocated datasets on human movement has revolutionized the quantitative study of mobility, providing insights into areas as diverse as migration flows, traffic forecasting, urban planning, pollution mitigation, socioeconomic inequalities, and epidemic modeling. Commonly observed patterns include bursty activity rates, a tendency to revisit a few locations frequently, and a decreasing likelihood of exploring new locations over time. Predicting future locations from past travel history is a particularly valuable application, and studies suggest a high degree of predictability (70-90%). Humans are social beings, and social structures influence mobility patterns: social relationships account for 10-30% of human movement, and information about an individual's movement can be inferred from their social ties. Previous research on online interactions showed that approximately 95% of the potential predictive accuracy for an individual could be obtained from their social ties alone. The study explores whether a person's social network can predict their future mobility patterns even without their own movement history, which could be particularly relevant for applications such as pandemic mitigation and contact tracing. Location-based social networks (LBSNs) and call detail records (CDRs) offer opportunities to examine the relationship between social relations and human mobility. Spatially aggregated data can also reveal individuals from different social circles who visit similar locations (non-social colocators). The study investigates the predictive power of both social ties and non-social colocators on an individual's mobility.
Literature Review
The authors review existing literature on human mobility patterns, highlighting the predictive power of past location visits and the influence of social structures on movement. They cite studies demonstrating the scaling laws of human travel, the bursty nature of activity rates, and the tendency for individuals to exhibit preferential attachment to certain locations. Furthermore, the literature emphasizes the significant role of social ties in shaping mobility, referencing research showing a substantial portion of human movement is attributable to social interactions. The limitations of existing datasets (incomplete social networks, under-sampled location visits) are also discussed.
Methodology
The research utilizes four datasets: three publicly available LBSNs (BrightKite, Gowalla, Weeplaces) and a private CDR dataset from Rio de Janeiro. Each dataset presents unique characteristics and limitations in terms of spatial resolution, user bias, and data completeness. The researchers employ non-parametric information-theoretic estimators to analyze human mobility data. Entropy rate (S<sub>A</sub>) is used to quantify the uncertainty in predicting future locations, while predictability (Π<sub>A</sub>) represents the upper bound of an ideal predictive algorithm's accuracy. These measures consider both the frequency of location visits and temporal ordering. The study constructs both social and non-social colocation networks. Social ties are based on reported relationships in the LBSNs and inferred reciprocity in call records (at least 30 reciprocal calls per week). Non-social colocators are defined as individuals who frequently visit the same locations as the ego within a one-hour time window but have no direct social connection. Cross-entropy (S<sub>A|B</sub>) and cross-predictability (Π<sub>A|B</sub>) are used to measure the information flow from alters (social ties or non-social colocators) to the ego. The analysis also considers the cumulative information from multiple alters, generalizing cross-entropy and cross-predictability to account for sets of alters. The Overlapped Distinct Location Ratio (ODLR) and Cumulative Overlapped Distinct Location Ratio (CODLR) are introduced to quantify the overlap of unique locations visited by the ego and alters. A temporal analysis is conducted using time-displaced colocators to assess the impact of temporal lags on information transfer.
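The two measures above can be illustrated with a minimal sketch. The summary does not say which estimator the authors use, so the code below assumes a Lempel-Ziv match-length estimator, a common non-parametric choice in the mobility literature, and obtains the predictability bound by numerically inverting Fano's inequality. The function names `lz_entropy_rate` and `max_predictability` are hypothetical and are not taken from the paper.

```python
import math


def _occurs(sub, hist):
    """Return True if `sub` appears as a contiguous run inside `hist`."""
    m = len(sub)
    return any(hist[j:j + m] == sub for j in range(len(hist) - m + 1))


def lz_entropy_rate(seq):
    """Lempel-Ziv style entropy-rate estimate, in bits per visit.

    Assumed form: S ~= n * log2(n) / sum_i(Lambda_i), where Lambda_i is the
    length of the shortest run starting at position i that has not appeared
    earlier in the sequence. Runs that reach the end of the sequence are
    counted as (n - i + 1), one common convention.
    """
    n = len(seq)
    if n < 2:
        return 0.0
    total = 0
    for i in range(n):
        k = 1
        while i + k <= n and _occurs(seq[i:i + k], seq[:i]):
            k += 1
        total += k
    return n * math.log2(n) / total


def max_predictability(entropy, n_locations):
    """Upper bound on prediction accuracy from Fano's inequality.

    Solves S = H(p) + (1 - p) * log2(N - 1) for p on [1/N, 1] by bisection,
    where H is the binary entropy and N the number of distinct locations.
    """
    if n_locations < 2:
        return 1.0

    def fano(p):
        h = 0.0
        if 0.0 < p < 1.0:
            h = -p * math.log2(p) - (1 - p) * math.log2(1 - p)
        return h + (1 - p) * math.log2(n_locations - 1)

    lo, hi = 1.0 / n_locations, 1.0
    if entropy >= fano(lo):
        return lo  # no better than guessing uniformly at random
    for _ in range(60):
        mid = (lo + hi) / 2
        if fano(mid) > entropy:  # fano() decreases on [1/N, 1]
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Illustrative trajectory (made-up data, not from any of the four datasets).
trajectory = ["home", "work", "home", "gym", "home", "work", "home", "work"]
s = lz_entropy_rate(trajectory)
pi = max_predictability(s, len(set(trajectory)))
print(f"entropy rate = {s:.3f} bits, predictability bound = {pi:.3f}")
```

On trajectories this short the estimator is heavily biased, so the printed numbers only show the mechanics. A cross-entropy variant would search for matches in an alter's earlier visits instead of the ego's own history.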
Key Findings
The analysis of entropy and predictability reveals variations across the four datasets that reflect their different contexts of app usage, spatial resolution, and population behavior. Individual social ties consistently provide more information about an ego's future location than individual non-social colocators. However, aggregating information from multiple non-social colocators significantly increases predictive power: a group of non-social colocators can provide as much information as a smaller set of social ties. For instance, in the Weeplaces dataset, the top three non-social colocators provide higher predictability than the top social tie, and the top seven colocators exceed the information content of the top two social ties. The predictability of egos is positively correlated with that of their top alters, so highly predictable egos tend to have highly predictable alters. Both social and non-social alters carry non-redundant information, even when the ego's own past trajectory is taken into account. In the Weeplaces dataset, up to 94% and 85% of the ego's predictability is contained in the social ties and non-social colocators, respectively. With the ego's past trajectory also included, the reported values are around 56-57%. The Overlapped Distinct Location Ratio (ODLR) and Cumulative Overlapped Distinct Location Ratio (CODLR) show that higher-ranked alters share more unique locations with the ego, and this trend is stronger for social ties. Information transfer is strongly correlated with location overlap, regardless of the type of tie (social or non-social). Temporal analysis reveals that time-displaced colocators (individuals visiting the same location within a certain time window, even if not concurrently) still provide significant predictive information, which points to unexpected sources of mobility information.
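The location-overlap measures can be sketched in a few lines. The summary does not give the exact formulas, so the code below assumes that ODLR is the share of the ego's distinct locations that a single alter has also visited, and that CODLR is the same share computed over the pooled locations of the top-ranked alters. The function names `odlr` and `codlr` and the sample data are hypothetical.

```python
def odlr(ego_visits, alter_visits):
    """Share of the ego's distinct locations also visited by one alter.

    Assumption: the ratio is normalized by the ego's distinct-location count.
    """
    ego_places = set(ego_visits)
    if not ego_places:
        return 0.0
    return len(ego_places & set(alter_visits)) / len(ego_places)


def codlr(ego_visits, ranked_alter_visits):
    """Cumulative overlap after adding each alter, in rank order.

    Returns one ratio per alter: entry k covers the pooled locations of the
    k + 1 highest-ranked alters.
    """
    ego_places = set(ego_visits)
    if not ego_places:
        return []
    covered = set()
    ratios = []
    for visits in ranked_alter_visits:
        covered |= ego_places & set(visits)
        ratios.append(len(covered) / len(ego_places))
    return ratios


# Made-up check-ins for illustration only.
ego = ["home", "work", "gym", "cafe", "work", "home"]
alters = [
    ["work", "cafe", "bar"],   # highest-ranked alter
    ["gym", "work"],
    ["park"],
]
print(odlr(ego, alters[0]))   # 0.5
print(codlr(ego, alters))     # [0.5, 0.75, 0.75]
```

Under these assumptions CODLR never decreases as alters are added, which matches the idea that pooling many colocators can cover more of the ego's locations than any one of them does alone.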
Discussion
The findings address the research question by demonstrating that both social and non-social ties contain predictive information about an individual's mobility. The results are significant because they show that personal mobility patterns can be inferred both from direct social connections and from indirect colocations. This has important implications for privacy: individuals who provide access to their location data may inadvertently reveal information about their social circles and also about a much larger network of individuals they have never interacted with. The results contribute to the ongoing debate on data privacy and highlight the need for stronger access constraints on mobility information. While this data is valuable for applications such as contact tracing, it raises significant ethical concerns that warrant careful consideration. Future research could explore the socioeconomic or demographic factors that influence how well mobility can be predicted from colocations, since egos and their colocators may share similar demographic characteristics.
Conclusion
This study demonstrates that both social ties and non-social colocators provide significant predictive information about human mobility. While social ties consistently offer more information than individual non-social colocators, aggregating information from multiple colocators can reach the predictive power of a small number of social ties. This has significant privacy implications, highlighting the need for robust data protection measures and responsible algorithmic development. Future work could focus on richer datasets to better understand the interplay of social, economic, and demographic factors on mobility predictability.
Limitations
The study acknowledges several limitations. The datasets used have inherent biases stemming from user behavior and app usage, and spatial resolution varies across datasets. Social network data may be incomplete, under-representing an individual's full social circle. Location trajectories may be under-sampled due to reliance on user check-ins. The use of observational data introduces limitations in terms of causal inference.