Chapter 2 Data sources

The Worldwide Mobile App User Behavior Dataset contains cross-country user behavior data collected through a survey conducted by Lim et al. for their research paper, Investigating Country Differences in Mobile App User Behavior and Challenges for Software Engineering, which investigates the relationship between app user behavior and the user's country. The dataset is publicly available in the Harvard Dataverse Repository, an open data repository organized by the Harvard community.

The survey collected responses from 10,208 participants spanning more than 15 countries, including the United States of America, China, Japan, Germany, France, Brazil, the United Kingdom, Italy, the Russian Federation, India, Canada, Spain, Australia, Mexico, and the Republic of Korea. The questionnaire has three parts with 31 questions in total: the first part focuses on user behavior in terms of mobile app usage; the second part asks for individual demographic information such as gender, age, and household income; the third part centers on user personality, based on the Big Five personality traits.

A more detailed description of the attributes included in this dataset follows:

Dataset: mobile_app_user_dataset.xlsx

Link to Harvard Dataverse Repository: https://dataverse.harvard.edu/dataset.xhtml;jsessionid=c4428c7c612c7607165aa4d0ebde?persistentId=doi%3A10.7910%2FDVN%2F27459&version=&q=&fileAccess=&fileTag=%22dataset%22&fileSortField=name&fileSortOrder=desc

Table 2.1: Column Description
Column Name	Description
ID	Unique ID for each participant
StartDate	Date and time the participant started this survey
EndDate	Date and time the participant completed this survey
Response Status	Response status: incomplete/complete/screened out/bad data
Participant Type	ours/panel
Q1_1	Browser type
Q1_2	Browser version
Q1_3	Browser operating system
Q1_4	Browser screen resolution
Q1_5	Browser flash version
Q1_6	Browser Java support
Q1_7	Browser user agent
Q2	Binary indicator of whether the respondent has a mobile device
Q3_1	Mobile device manufacturer
Q3_2	Mobile device model
Q3_3	Binary indicator of whether the respondent uses apps
Q4_1	Type of app store
Q4_2	Binary indicator of whether the respondent knows which app store he/she is using
Q5	How often the respondent visits the app store to look for apps
Q6	Number of apps downloaded per month
Q7	Under what situation does the respondent look for apps
Q8	Ways to find apps
Q9	Things to consider before downloading
Q10	Purpose of downloading
Q11	Reasons for spending money on apps
Q12_1	Maximum spending on an app
Q12_2	Name of the app on which the respondent has spent the most
Q12_3	Reasons for downloading the app on which the respondent has spent the most
Q12_4	Best/worst feature of the app on which the respondent has spent the most
Q12_5	Average spending on apps per month
Q13	Reasons for rating apps
Q14	Reasons for no longer using an app
Q15	Type of downloaded app
Q16	Respondent’s gender
Q17	Respondent’s age
Q18	Respondent’s marital status
Q19	Respondent’s nationality
Q20	Respondent’s country of residence
Q21	Respondent’s first language
Q22	Respondent’s ethnicity
Q23	Respondent’s highest level of education
Q24	Respondent’s years of education
Q25	Whether the respondent has a disability
Q26	Respondent’s current employment status
Q27	Respondent’s occupation
Q28	Currency of household income
Q29	Annual household income
Q30	Personality traits
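
The column names in Table 2.1 follow a regular pattern (`Q<number>` with an optional `_<sub-item>` suffix), so columns can be grouped by questionnaire part programmatically. The following is a minimal sketch in Python. The function name `questionnaire_part` and the boundaries between parts (Q2 to Q15 for app usage, Q16 to Q29 for demographics, Q30 for personality) are assumptions inferred from the column descriptions above, not something the dataset documentation states explicitly. Q1_1 to Q1_7 describe the respondent's browser and are treated here as survey metadata.

```python
import re


def questionnaire_part(column: str) -> str:
    """Return the (assumed) survey part that a column of Table 2.1 belongs to."""
    # Question columns look like "Q6" or "Q12_5"; anything else is metadata.
    match = re.fullmatch(r"Q(\d+)(?:_\d+)?", column)
    if match is None:
        return "metadata"      # e.g. ID, StartDate, EndDate, Response Status
    number = int(match.group(1))
    if number == 1:
        return "metadata"      # Q1_1..Q1_7: browser information
    if number <= 15:
        return "app usage"     # assumed range: Q2..Q15
    if number <= 29:
        return "demographics"  # assumed range: Q16..Q29
    return "personality"       # Q30


# A few column names taken from Table 2.1.
columns = ["ID", "Response Status", "Q1_3", "Q6", "Q12_5", "Q17", "Q29", "Q30"]
parts = {name: questionnaire_part(name) for name in columns}
for name, part in parts.items():
    print(f"{name}: {part}")
```

The same function could be applied to the header row of mobile_app_user_dataset.xlsx after loading it, for example to select only the demographic columns for an analysis.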

Issue with this dataset: the survey is self-reported and may therefore be subjective, influenced by factors such as individual experience and cultural background. In addition, many answers provided by participants are approximations rather than exact values. For example, respondents might not know the exact number of apps they download per month and report an estimate instead.