Chapter 2 Data sources
The Worldwide Mobile App User Behavior Dataset contains cross-country user behavior data collected through a survey conducted by Lim et al. for their research paper, Investigating Country Differences in Mobile App User Behavior and Challenges for Software Engineering, which investigates the relationship between app user behavior and country demographics. The dataset is publicly available in the Harvard Dataverse Repository, an open data repository organized by the Harvard community.
The survey collected responses from 10,208 participants spanning more than 15 countries, including the United States of America, China, Japan, Germany, France, Brazil, the United Kingdom, Italy, the Russian Federation, India, Canada, Spain, Australia, Mexico, and the Republic of Korea. The questionnaire has three parts with 31 questions in total: the first part focuses on user behavior in terms of mobile app usage; the second part asks for individual demographic information such as gender, age, and household income; the third part is centered on user personality, based on the Big Five personality traits.
A more detailed description of the attributes included in this dataset follows:
Dataset: mobile_app_user_dataset.xlsx
Link to Harvard Dataverse Repository: https://dataverse.harvard.edu/dataset.xhtml;jsessionid=c4428c7c612c7607165aa4d0ebde?persistentId=doi%3A10.7910%2FDVN%2F27459&version=&q=&fileAccess=&fileTag=%22dataset%22&fileSortField=name&fileSortOrder=desc
Column Name | Description |
---|---|
ID | Unique ID for each participant |
StartDate | Date and time the participant started this survey |
EndDate | Date and time the participant completed this survey |
Response Status | Response status: incomplete/complete/screened out/bad data |
Participant Type | ours/panel |
Q1_1 | Browser type |
Q1_2 | Browser version |
Q1_3 | Browser operating system |
Q1_4 | Browser screen resolution |
Q1_5 | Browser flash version |
Q1_6 | Browser Java support |
Q1_7 | Browser user agent |
Q2 | Binary indicator of whether the respondent has a mobile device |
Q3_1 | Mobile device manufacturer |
Q3_2 | Mobile device model |
Q3_3 | Binary indicator of whether the respondent uses apps |
Q4_1 | Type of app store |
Q4_2 | Binary indicator of whether the respondent knows which app store they are using |
Q5 | How often the respondent visits the app store to look for apps |
Q6 | Number of apps downloaded per month |
Q7 | Situations in which the respondent looks for apps |
Q8 | Ways to find apps |
Q9 | Things to consider before downloading |
Q10 | Purpose of downloading |
Q11 | Reasons for spending money on apps |
Q12_1 | Maximum spending on an app |
Q12_2 | Name of the app on which the respondent has spent the most |
Q12_3 | Reasons for downloading the app on which the respondent has spent the most |
Q12_4 | Best/worst feature of the app on which the respondent has spent the most |
Q12_5 | Average spending on apps per month |
Q13 | Reasons for rating apps |
Q14 | Reasons for no longer using an app |
Q15 | Type of downloaded app |
Q16 | Respondent’s gender |
Q17 | Respondent’s age |
Q18 | Respondent’s marital status |
Q19 | Respondent’s nationality |
Q20 | Respondent’s country of residence |
Q21 | Respondent’s first language |
Q22 | Respondent’s ethnicity |
Q23 | Respondent’s highest level of education |
Q24 | Respondent’s years of education |
Q25 | Whether the respondent has a disability |
Q26 | Respondent’s current employment status |
Q27 | Respondent’s occupation |
Q28 | Currency of household income |
Q29 | Annual household income |
Q30 | Personality traits |
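
The column naming scheme in the table above lends itself to simple programmatic grouping. The following is a minimal Python sketch, not part of the original study, that assigns each column to a questionnaire part and keeps only completed responses. The boundaries between parts (Q1 as browser metadata, Q2 to Q15 as app usage, Q16 to Q29 as demographics, Q30 as personality) are an assumption inferred from the column descriptions, the function names are hypothetical, and the exact spelling of the status value `complete` in the file is also assumed.

```python
import re

# Assumed grouping of question numbers into questionnaire parts, inferred
# from the column descriptions in the table; not confirmed by the survey authors.
PARTS = [
    (range(1, 2), "browser metadata"),     # Q1_1 ... Q1_7
    (range(2, 16), "app usage behavior"),  # Q2 ... Q15
    (range(16, 30), "demographics"),       # Q16 ... Q29
    (range(30, 31), "personality"),        # Q30
]

# Matches names such as "Q7" or "Q12_3" and captures the question number.
QUESTION = re.compile(r"^Q(\d+)(?:_\d+)?$")


def questionnaire_part(column):
    """Return the (assumed) questionnaire part for a column name."""
    match = QUESTION.match(column.strip())
    if match is None:
        # ID, StartDate, EndDate, Response Status, Participant Type
        return "survey metadata"
    number = int(match.group(1))
    for numbers, label in PARTS:
        if number in numbers:
            return label
    return "unknown"


def group_columns(columns):
    """Group column names by questionnaire part, preserving input order."""
    groups = {}
    for column in columns:
        groups.setdefault(questionnaire_part(column), []).append(column)
    return groups


def keep_complete(rows, status_key="Response Status"):
    """Keep rows (dicts) whose response status is 'complete'.

    The comparison ignores case and surrounding whitespace because the
    exact spelling used in the spreadsheet is an assumption here.
    """
    return [
        row for row in rows
        if str(row.get(status_key, "")).strip().lower() == "complete"
    ]


if __name__ == "__main__":
    # Hypothetical subset of the header row, for illustration only.
    header = ["ID", "Response Status", "Q1_1", "Q6", "Q12_5", "Q17", "Q30"]
    for part, names in group_columns(header).items():
        print(part, names)
```

The sketch uses only the standard library so that it runs without the spreadsheet. In practice the header and rows could come from `pandas.read_excel("mobile_app_user_dataset.xlsx")`, in which case the actual column names in the file should be checked against the table first.
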
Limitations of this dataset: the survey is self-reported, so responses may be subjective and influenced by factors such as individual experience and cultural background. In addition, many answers provided by participants are approximations rather than exact values. For example, respondents might not know the exact number of apps they download per month and give an estimate instead.