Chapter 2 Data sources

The Worldwide Mobile App User Behavior Dataset contains cross-country user behavior data collected through a survey conducted by Lim et al. for their research paper, "Investigating Country Differences in Mobile App User Behavior and Challenges for Software Engineering," which investigates the relationship between app user behavior and country demographics. The dataset is publicly available in the Harvard Dataverse Repository, an open data repository organized by the Harvard community.

The survey collected responses from 10,208 participants spanning more than 15 countries, including the United States of America, China, Japan, Germany, France, Brazil, the United Kingdom, Italy, the Russian Federation, India, Canada, Spain, Australia, Mexico, and the Republic of Korea. The questionnaire has three parts with 31 questions in total: the first part focuses on user behavior in terms of mobile app usage; the second part asks for individual demographic information such as gender, age, and household income; the third part centers on user personality based on the Big Five personality traits.

A more detailed description of the attributes included in this dataset follows:

Dataset: mobile_app_user_dataset.xlsx

Link to Harvard Dataverse Repository: https://dataverse.harvard.edu/dataset.xhtml;jsessionid=c4428c7c612c7607165aa4d0ebde?persistentId=doi%3A10.7910%2FDVN%2F27459&version=&q=&fileAccess=&fileTag=%22dataset%22&fileSortField=name&fileSortOrder=desc

Table 2.1: Column Description
Column Name Description
ID Unique ID for each participant
StartDate Date and time the participant started this survey
EndDate Date and time the participant completed this survey
Response Status Response status: incomplete/complete/screened out/bad data
Participant Type ours/panel
Q1_1 Browser type
Q1_2 Browser version
Q1_3 Browser operating system
Q1_4 Browser screen resolution
Q1_5 Browser flash version
Q1_6 Browser Java support
Q1_7 Browser user agent
Q2 Binary indicator of whether the respondent has a mobile device
Q3_1 Mobile device manufacturer
Q3_2 Mobile device model
Q3_3 Binary indicator of whether the respondent uses apps
Q4_1 Type of app store
Q4_2 Binary indicator of whether the respondent knows which app store he/she is using
Q5 How often the respondent visits the app store to look for apps
Q6 Number of apps downloaded per month
Q7 Situations in which the respondent looks for apps
Q8 Ways to find apps
Q9 Things to consider before downloading
Q10 Purpose of downloading
Q11 Reasons for spending money on apps
Q12_1 Maximum spending on an app
Q12_2 Name of the app that the respondent has spent the most on
Q12_3 Reasons for downloading the app that the respondent has spent the most on
Q12_4 Best/worst feature of the app that the respondent has spent the most on
Q12_5 Average spending on apps per month
Q13 Reasons for rating apps
Q14 Reasons for no longer using an app
Q15 Type of downloaded app
Q16 Respondent’s gender
Q17 Respondent’s age
Q18 Respondent’s marital status
Q19 Respondent’s nationality
Q20 Respondent’s country of residence
Q21 Respondent’s first language
Q22 Respondent’s ethnicity
Q23 Respondent’s highest level of education
Q24 Respondent’s years of education
Q25 Whether the respondent has a disability
Q26 Respondent’s current employment status
Q27 Respondent’s occupation
Q28 Currency of household income
Q29 Annual household income
Q30 Personality traits
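
The column names in Table 2.1 can be grouped back into the three questionnaire parts described earlier. The following is a minimal sketch in Python, not part of the original study. The question ranges are assumptions inferred from the table (Q2 to Q15 as app usage behavior, Q16 to Q29 as demographics, Q30 as personality, with Q1 treated separately as browser details), and the helper names classify_column and keep_complete are hypothetical. The exact spelling of the "Response Status" values in the file is also assumed, so the comparison is case-insensitive.

```python
import re

# Assumed question ranges, inferred from Table 2.1 (not stated by the survey authors).
BEHAVIOR_RANGE = range(2, 16)       # Q2-Q15: app usage behavior
DEMOGRAPHIC_RANGE = range(16, 30)   # Q16-Q29: demographics
PERSONALITY_RANGE = range(30, 31)   # Q30: personality traits

# Matches names such as "Q5" or "Q12_3" and captures the question number.
QUESTION_PATTERN = re.compile(r"Q(\d+)(?:_\d+)?")


def classify_column(name):
    """Return a coarse group label for a column name from Table 2.1."""
    match = QUESTION_PATTERN.fullmatch(name.strip())
    if match is None:
        # ID, StartDate, EndDate, Response Status, Participant Type
        return "survey_metadata"
    number = int(match.group(1))
    if number == 1:
        return "browser_info"
    if number in BEHAVIOR_RANGE:
        return "behavior"
    if number in DEMOGRAPHIC_RANGE:
        return "demographics"
    if number in PERSONALITY_RANGE:
        return "personality"
    return "unknown"


def keep_complete(rows):
    """Keep rows whose 'Response Status' is 'complete' (case-insensitive)."""
    return [
        row for row in rows
        if str(row.get("Response Status", "")).strip().lower() == "complete"
    ]


# Made-up rows for illustration only; they are not taken from the dataset.
sample_rows = [
    {"ID": "1", "Response Status": "complete"},
    {"ID": "2", "Response Status": "screened out"},
    {"ID": "3", "Response Status": "Complete"},
]
complete_ids = [row["ID"] for row in keep_complete(sample_rows)]
groups = {name: classify_column(name) for name in ["ID", "Q1_3", "Q12_5", "Q17", "Q30"]}
```

In practice the rows would come from mobile_app_user_dataset.xlsx after loading it with a spreadsheet reader of choice. The grouping should be checked against the original questionnaire before it is relied on.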

Issues with this dataset: the survey is self-reported, so responses may be subjective and influenced by factors such as individual experience and cultural background. In addition, many answers provided by participants are approximations rather than exact values. For example, respondents might not know the exact number of apps they download per month and give an estimate instead.