MOTIVATION OF PARTICIPANTS IN CROWDSOURCING PLATFORMS USING INTELLIGENT AGENTS

Crowdsourcing is a model in which individuals or organizations obtain services, such as ideas, funding, or the completion of a complex task, from a large group of Internet users. Several crowdsourcing websites have failed due to a lack of user participation; the success of a crowdsourcing platform therefore depends on the mass participation of users. However, motivating users to participate in a crowdsourcing platform remains challenging. We propose a new approach, a reinforcement learning-based gamification method, to motivate users. Gamification has been a practical approach to engaging users in many fields, but it still needs improvement on crowdsourcing platforms. In this paper, the gamification approach is strengthened by a reinforcement learning algorithm. We have created an intelligent agent using the Q-learning reinforcement learning algorithm. This agent suggests an optimal action plan that yields maximum reward points to users for their active participation in the crowdsourcing application. Its performance is also compared with the SARSA algorithm, an on-policy reinforcement learning algorithm.


INTRODUCTION
Recently, crowdsourcing has emerged as a cutting-edge problem-solving platform for businesses, in which many people are involved in solving complex problems that machines cannot solve. Crowdsourcing is the process of solving a complex problem with the help of many people, mostly online. The advantage of using crowdsourcing in business is that millions of people with diverse knowledge can share their inputs or expertise to solve a complex problem, and in some cases the solutions they provide are better than an expert's answer. Crowdsourcing platforms are active with different focuses in many fields; examples include Amazon Mechanical Turk, Wikipedia, Threadless.com, IStockPhoto, and Galaxy Zoo.
Crowdsourcing works well when many people participate in solving the problem. However, many crowdsourcing projects fail due to a lack of people's involvement [1]. Hence, motivating people to participate in a crowdsourcing platform is the primary challenge in using it. The basic idea of motivation is to attract people to perform an action or task. Two types of motivation are used in crowdsourcing platforms, i.e., intrinsic motivation and extrinsic motivation.
In intrinsic motivation, an individual is motivated internally, i.e., just by doing the task without expecting any external reward. In extrinsic motivation, an individual expects something external, such as reputation or money, for doing the work [2]. Jakob Nielsen conducted a study on online communities and concluded that only a small fraction of users contribute to such platforms; the majority of users do not show much interest [3]. It is therefore important to understand the motivators or incentives for participation in depth [4].
International Journal of Computing, Print ISSN 1727-6209, On-line ISSN 2312-5381, computing@computingonline.net, www.computingonline.net
Many researchers have studied the consequences of intrinsic and extrinsic motivation in crowdsourcing platforms [5], [6], [7], [8] and found that money plays a significant role in attracting individuals at the beginning. Some researchers have analyzed the well-known crowdsourcing platform Amazon Mechanical Turk and concluded that a high monetary reward does not increase the quality of the work submitted by participants, whereas intrinsic motivation helps to improve it [2]. The absence of a standard pricing structure tends to spoil the relationship between the requester and the worker. Further, high-budget tasks are more lucrative to many workers than low-budget tasks [9]. It was also observed that financial rewards typically degrade a worker's performance compared with a system that offers no reward [10]. Etzioni [11] concludes that a monetary reward is not good in every situation and also spoils the intrinsic motivation of workers. Hence, in this research, we propose gamification techniques to motivate people in the crowdsourcing platform.
Gamification is a way to attract people and improve their participation. The use of gamification may prevent the situation in which a crowdsourcing task fails because too few people participate. Consequently, how to apply gamification properly to build support and commitment has become a recent research topic [12].
Generally, the use of game elements in a non-game context to encourage desired behaviors of participants is called gamification. In a learning system, game elements such as points, levels, and badges are generally used as incentives. Hence, the primary aspect of gamification is rewarding, which provides extrinsic motivation to the user. Moreover, extrinsic motivation is not necessarily financial; it can also be non-financial. The problem with extrinsic motivation is that most platforms pay only a small amount of money as remuneration after the work is completed. Moreover, people who participate in crowdsourcing work are always unsure whether their work will be accepted or rejected as unsatisfactory. Some crowdsourcing sites do not provide anything at all, so how to motivate people to participate is a primary challenge for crowdsourcing platforms. We have attempted to address this problem in this work.
The rest of the paper is made up of the following sections. Section 2 consists of related work which discusses some of the recent works in this area carried out by other researchers. Section 3 provides the background required to help understand the next section. Section 4 describes the proposed system. In section 5, we describe the experimentation details and the results obtained. Section 6 concludes our work.

RELATED WORK
Gamification has been used by scholars and practitioners in different contexts and different fields such as marketing, social networking, and learning.
J. Goncalves used crowdsourcing and gamification to create a keyword dictionary for describing locations on public displays [4]. In [13], an image-labeling game called Wordsmith was designed for the experimental game conditions. Gamification techniques are used to attract and retain many reliable workers for crowdsourcing tasks such as relevance assessment and clustering [14]. Another study explored the possibility of engaging a secondary group of millennials, who are well-known technology enthusiasts, with a gamified citizen science application named Bio tracker, which is used to collect plant phenology data [15].
Because the number of employees who reply to survey requests is generally low, Smith and Kilty used game elements along with crowdsourcing to encourage employees to respond to online enterprise survey requests about software quality [16]. These days, gamification is also being used in requirements engineering in many software companies [17].
In [18], a large amount of data was collected from a heterogeneous population using gamification to study touchscreen operation in natural environments. A gamified crowdsourcing system named Quizz was developed to assess the knowledge of users and gain new insights from them. Quizz asks users to complete short quizzes on specific topics; as a user answers the quiz questions, Quizz evaluates the user's ability [19].
B. Morschheuser et al. explored how different gamification techniques build a crowd's motivation and participation in crowdsourcing tasks. Their empirical study showed that gamification is a practical approach for increasing people's engagement in crowdsourcing [20]. Another study presented G.A.M.E., a framework for guiding the design of gamification in crowdsourcing-based systems. The framework gave an adaptable step-by-step guideline that consolidated information from software engineering, shared courses of action, game plan, and communication structure [21]. Microsoft created Code Hunt, a web-based gaming platform in which people participate in crowdsourcing tasks [22].
The above studies indicate that gamification is used in many applications across various domains but still needs improvement. In this paper, we use an intelligent agent based on a reinforcement learning algorithm to strengthen the gamification technique. Reinforcement learning has already been used in crowdsourcing for various purposes, such as task assignment and incentive design [23], [24], [25], [26]. Some researchers and physiologists have tried to understand intrinsic motivation based on behavioral theory and to create a model of it. In [27], the authors explained that intrinsic motivation acts as an internal reward mechanism that fits well with a simple reinforcement learning algorithm in cognitive computing. Both the internal reward and the reinforcement learning algorithm can be used in an autonomous learning system. The simplest form of reinforcement learning is based on Thorndike's law of effect [28]: if an action is followed by an improvement or satisfaction in the state of affairs, the tendency to produce that action is strengthened. Redgrave and Gurney explained the relationship between intrinsic motivation and dopamine [29]. Dopamine is a neurotransmitter that helps the brain generate the learning signal. Similarly, many researchers have worked on implementing intrinsic motivation in terms of reinforcement learning algorithms [30], [31], [32], [33].
It is clear from the above studies that there is a strong relationship between intrinsic motivation and reinforcement learning. Reinforcement learning gives predictions based on intrinsic motivation, which maximizes interest by providing data with reduced subjective complexity, and it is grounded in neuroscientific theory on action discovery and learning. Intrinsic motivation learning signals are released by the brain during the prediction of future states from current states. This concept is well suited to motivating user participation in crowdsourcing activities.
Hence, we have created an intelligent agent using reinforcement learning and the gamification technique. The reinforcement learning algorithm gives the motivation signal during the prediction of future actions, and the gamification technique gives virtual rewards that externally motivate the user's engagement in crowdsourcing activities.

CONCEPTUAL FOUNDATION
In this section, the gamification technique and the reinforcement learning algorithm are briefly defined.

GAMIFICATION TECHNIQUE
The use of game components and game mechanics in a non-gaming context is called gamification [34]. In Werbach and Hunter's pyramid [35], game elements are organized into three classes: dynamics, mechanics, and components. The elements in the dynamics group are progression, emotions, constraints, and relationships. The elements in the mechanics group, such as feedback, challenge, cooperation, and competition, drive user engagement with content. Components, including achievements, badges, combat, leaderboards, and levels, are the tools used to motivate users in the environment of interest. In our research, points and a golden reward are used as game elements to motivate users to participate in the crowdsourcing platform.

REINFORCEMENT LEARNING ALGORITHM
A Markov decision process is a simple sequential decision-making process under uncertainty [36]. The reinforcement learning framework is similar to a Markov decision process: it consists of states, actions, and rewards in an uncertain environment. The states represent the responses received from the environment, for example, contributions from the agents. The actions represent the activities an agent performs in the environment. The responses to the actions produce a reward or a punishment [37]. Learning happens through positive or negative feedback. The reinforcement learning algorithm allows agents to decide automatically on the best actions within a particular environment so as to increase their reward. A reinforcement signal, the simple reward, is required for the agent to learn the actions. One of the important reinforcement learning algorithms is Q-learning, which we have used to create the intelligent agent.

Q FUNCTION
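Following Watkins [38], the reward and the policy are defined as follows.
Reward: In a Markov decision process (MDP), the movement of the agent from one state to another depends on the current state $S_t$ and the reward received. Since the environment is stochastic, the total reward $R_t$ received from the current time step $t$ to the end of the task is defined as the discounted sum of the per-step rewards $r$:
$$R_t = r_{t+1} + \gamma r_{t+2} + \gamma^{2} r_{t+3} + \dots = \sum_{k \ge 0} \gamma^{k} r_{t+k+1}$$
where $0 < \gamma < 1$ is the discount factor.
Policy: The policy $\pi$ is defined as the probability of selecting action $a$ in the given state $s$. It is mathematically described as:
$$\pi(a \mid s) = P(A_t = a \mid S_t = s)$$
There are two types of value functions, i.e., the state value function and the action value function. The state value function $V^{\pi}(s)$ is defined as the long-term reward of state $s$ when following policy $\pi$. It is described as:
$$V^{\pi}(s) = \mathbb{E}_{\pi}\left[ R_t \mid S_t = s \right]$$
The action value function $q^{\pi}(s, a)$ is defined as the expected reward for the given action $a$ at a particular state $s$ when following the policy $\pi$:
$$q^{\pi}(s, a) = \mathbb{E}_{\pi}\left[ R_t \mid S_t = s, A_t = a \right]$$
In Q-learning, the Q-value is updated as:
$$Q^{new}(s, a) = Q^{old}(s, a) + \alpha \left[ r + \gamma \max_{a'} Q(s', a') - Q^{old}(s, a) \right]$$
where $Q^{new}(s, a)$ and $Q^{old}(s, a)$ are the new and old Q-values for action $a$ at state $s$, and $s'$ is the next state. Here, $\alpha$ and $\gamma$ are the learning rate and the discount factor, respectively, each in the interval (0, 1), and $r$ is the reward for carrying out action $a$ in state $s$.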
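As a minimal sketch of the Q-value update described above, the following Python function applies one Q-learning step to a table stored as a dictionary. The function name `q_update`, the dictionary layout, and all numeric values are illustrative assumptions and are not taken from the original system.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions, alpha=0.5, gamma=0.9):
    """Apply one Q-learning update in place and return the new Q(state, action)."""
    # Best estimated value reachable from the next state (0.0 for unseen pairs).
    best_next = max(q[(next_state, a)] for a in actions)
    # Temporal-difference target: immediate reward plus discounted future value.
    target = reward + gamma * best_next
    # Move the old estimate toward the target by a fraction alpha.
    q[(state, action)] += alpha * (target - q[(state, action)])
    return q[(state, action)]


# Hypothetical values, chosen only to make the arithmetic easy to follow.
actions = ["Q", "An", "B", "Ar", "T"]
q = defaultdict(float)
q[("s1", "An")] = 2.0
new_value = q_update(q, "s0", "Ar", reward=1.0, next_state="s1", actions=actions)
print(new_value)
```

With these assumed numbers, the old value is 0, the target is 1 + 0.9 × 2 = 2.8, and the updated value is 0.5 × 2.8 = 1.4.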

PROPOSED SYSTEM
We have used the academic domain as a case study to show the performance of the proposed method. It fits our research scenario well, as universities and colleges consist of many people, such as students, professors, and administrative staff. Normally, problems in an academic domain such as colleges and universities are solved by engaging experts or consultants or by collecting feedback from people. In our research, we show that crowdsourcing is a more suitable technique for solving problems in the academic domain; hence, we have created a faculty network, a social media application, for professors to share their educational and research experience. This faculty network acts as a platform for collaborative tasks. It holds various modules such as forums, articles, blogs, and a task manager. The task manager module of the faculty network, shown in Fig. 1, is used for crowdsourcing activities: a requester can post a task, and workers can apply to solve it. If a user likes a contribution (answers, blogs, and articles), he or she can vote it up; if not, he or she can vote it down. All states and actions are tried repeatedly. Finally, the best action and reward are selected.
The Q-learning algorithm works as follows. A Q-table is updated with the state, the corresponding actions, and the rewards at each iteration. At every time step t, the agent observes its state s ∈ S, picks an action a ∈ A, and receives a reward R that depends on the up votes and down votes in that state. Actions are chosen according to a policy π: S → A, which changes over time as the agent tries different actions and collects the corresponding rewards. After numerous attempts, the total reward increases according to the reward function R: S × A → ℝ.

Figure 2 - Proposed System
Consider the states S = {s0 = 0-50, s1 = 51-100, s2 = 101-150, ..., sk = 951-1000} and the actions A = {post a question (Q), post an answer (An), post blog content (B), post an article (Ar), post a task (T)}. Assume that initially the reward R = 1; the reward R is then calculated from the up votes (Uv) and down votes (Dv) by the formula specified in Algorithm 1.
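The state and action sets above could be represented as in the following sketch, which maps accumulated reward points to a state index and allocates a zero-initialised Q-table. The names `state_index`, `ACTIONS`, and `q_table`, and the assumption that points are integers between 0 and 1000, are illustrative; the vote-based reward formula of Algorithm 1 is not reproduced here.

```python
ACTIONS = ["Q", "An", "B", "Ar", "T"]  # question, answer, blog, article, task
BIN_WIDTH = 50
MAX_POINTS = 1000


def state_index(points):
    """Map accumulated reward points to the index k of state s_k."""
    if not 0 <= points <= MAX_POINTS:
        raise ValueError("points must lie in [0, 1000]")
    # s0 covers 0-50, s1 covers 51-100, ..., s19 covers 951-1000.
    return max(0, (points - 1) // BIN_WIDTH)


NUM_STATES = state_index(MAX_POINTS) + 1

# Q-table with one row per state and one column per action, initialised to zero.
q_table = [[0.0 for _ in ACTIONS] for _ in range(NUM_STATES)]

print(state_index(0), state_index(50), state_index(51), state_index(1000))
print(len(q_table), len(q_table[0]))
```

Under this binning there are 20 states (s0 to s19) and five actions, so the Q-table has 100 entries.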

Q-LEARNING ALGORITHM
In the Q-learning algorithm, the agent moves from one state to another until it reaches the goal state, and the Q-values thus converge. The goal state is the last state (1000 points), as mentioned above. The faculty member receives the golden reward at the goal state.
The state (St) and action (At) pairs are updated in the Q-table at each time step until it converges. The Q-table then shows the optimal action for each state. The steps of the Q-learning algorithm are shown in Algorithm 1. Initially, the value of the reward (R) is 0. Then, the reward is observed for the actions performed at a particular state (S). The action with the maximum Q-value is selected.
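The procedure described in this section can be illustrated with a generic tabular Q-learning loop. The sketch below is a textbook formulation and may differ in detail from Algorithm 1. The environment is a toy stand-in: `toy_reward` is a hypothetical placeholder for the vote-based reward, and `step` assumes that every action moves the user to the next state. The hyperparameter values are also assumptions.

```python
import random

ACTIONS = ["Q", "An", "B", "Ar", "T"]
NUM_STATES = 20              # s0 .. s19
GOAL = NUM_STATES - 1        # the state containing the 1000-point goal
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.3
EPISODES = 3000


def toy_reward(state, action_idx):
    """Hypothetical reward in 1..5; NOT the vote-based formula of Algorithm 1."""
    return 1 + (state + action_idx) % len(ACTIONS)


def step(state, action_idx):
    """Toy transition: every action advances the user to the next state."""
    next_state = state + 1
    return next_state, toy_reward(state, action_idx), next_state == GOAL


def train(seed=0):
    rng = random.Random(seed)
    q = [[0.0] * len(ACTIONS) for _ in range(NUM_STATES)]
    for _ in range(EPISODES):
        state, done = 0, False
        while not done:
            # Epsilon-greedy choice: explore at random, otherwise exploit.
            if rng.random() < EPSILON:
                a = rng.randrange(len(ACTIONS))
            else:
                a = max(range(len(ACTIONS)), key=lambda i: q[state][i])
            next_state, reward, done = step(state, a)
            # The goal state is terminal, so it contributes no future value.
            best_next = 0.0 if done else max(q[next_state])
            q[state][a] += ALPHA * (reward + GAMMA * best_next - q[state][a])
            state = next_state
    return q


q_table = train()
# Greedy policy: the action with the largest Q-value in each non-goal state.
policy = [ACTIONS[max(range(len(ACTIONS)), key=lambda i: row[i])]
          for row in q_table[:GOAL]]
print(policy)
```

Because all actions lead to the same next state in this toy setting, the learned greedy action in each state is simply the one with the largest immediate toy reward, which makes the result easy to check by hand.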

EXPERIMENTAL SETUP AND RESULTS
As mentioned in Section 4, the faculty network has been implemented to test our motivation algorithm. Around 650 faculty members from our university were requested to register in the faculty network. It is a social media-based crowdsourcing platform. In this network, faculty members can share their academic and research experiences and can also perform crowdsourcing activities. The advantage of using a social media crowdsourcing platform is that it improves intrinsic motivation through, for example, peer review and connections with experts.
We have formed an intelligent agent using Q-learning. This intelligent agent, which is used to motivate faculty members to participate in crowdsourcing activities, is attached to our faculty network. The intelligent agent works as follows. It consists of a Q-table. An epsilon-greedy policy, which balances exploration and exploitation, is used to select the actions. When the agent starts learning, we would like it to take random actions to discover further actions. Once the agent has learned, the Q-function converges to stable Q-values, and the agent exploits the highest Q-value, i.e., it takes greedy actions. During the exploration phase, it tries all the possible states and actions, and after a sufficient number of iterations, it learns the correct action for a particular state. The agent selects a random action with probability ε and the greedy action with probability (1 - ε) [39]. This exploration is necessary; otherwise, the agent gets stuck in a local optimum and never finds an improved policy.
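The epsilon-greedy selection rule described above can be sketched as follows. The function name `epsilon_greedy`, the example Q-values, and the choice of ε = 0.1 are assumptions made for illustration only.

```python
import random


def epsilon_greedy(q_row, epsilon, rng=random):
    """Return an action index: random with probability epsilon, else greedy."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))                   # explore
    return max(range(len(q_row)), key=q_row.__getitem__)   # exploit


# Hypothetical Q-values for one state over five actions.
q_row = [0.2, 1.5, 0.7, 0.9, 0.1]
rng = random.Random(42)
picks = [epsilon_greedy(q_row, epsilon=0.1, rng=rng) for _ in range(10_000)]
greedy_share = picks.count(1) / len(picks)
print(round(greedy_share, 2))
```

Since a random draw can also land on the greedy action, the greedy action is expected in roughly 1 - ε + ε/5 = 92% of the calls in this example.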
The performance of the Q-learning algorithm is presented in Fig. 3 and Fig. 4. From the figures, one can see that our algorithm converges after some iterations and provides the optimal action for each state. We have also shown a sample policy in Table 2. This model informs the faculty member which actions give the maximum reward and reach the goal state easily from any state. The goal is to achieve 1000 points. Once the faculty member has received 1000 reward points, he or she is eligible for a golden reward and some privileges. This strengthens the gamification technique. In our experiment, we have also compared the performance of the Q-learning algorithm with that of another well-known reinforcement learning algorithm, SARSA [40]. Fig. 5 shows the convergence of an intelligent agent using the SARSA algorithm. Q-learning is an off-policy algorithm, and SARSA is an on-policy algorithm. The main difference between these two algorithms is how the Q-value is updated. In SARSA, the update uses the action actually performed by the current policy in the next state, whereas in Q-learning, the update uses the action with the maximum Q-value in the next state.
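The difference between the two update rules can be made concrete by comparing their bootstrap targets on the same transition. In this sketch the function names, the small Q-table, and the values of the reward and discount factor are hypothetical.

```python
def q_learning_target(q, reward, next_state, gamma):
    """Off-policy target: bootstrap from the best action in the next state."""
    return reward + gamma * max(q[next_state])


def sarsa_target(q, reward, next_state, next_action, gamma):
    """On-policy target: bootstrap from the action the policy actually took."""
    return reward + gamma * q[next_state][next_action]


# Hypothetical two-state Q-table with three actions per state.
q = [[0.0, 0.0, 0.0],
     [1.0, 4.0, 2.0]]
reward, gamma = 1.0, 0.5

# Suppose exploration made the agent pick action 2 (not the greedy action 1).
print(q_learning_target(q, reward, next_state=1, gamma=gamma))               # 3.0
print(sarsa_target(q, reward, next_state=1, next_action=2, gamma=gamma))     # 2.0
```

The two targets coincide whenever the next action chosen by the policy is the greedy one; they differ only on exploratory steps.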
We observe that both algorithms converge after some iterations. The SARSA algorithm converges faster than the Q-learning algorithm. Although it converges faster, it does not perform as well, i.e., it does not find the optimal action plan.
Although both algorithms converge, Q-learning is suitable for low-cost, fast-iterating environments because it learns the optimal policy directly. We have attached the intelligent agent to the faculty network and observed the policy obtained at each state. When a faculty member logs in to the system, it suggests the action that yields the maximum reward based on Table 2. For example, if the faculty member is in state S2, the system suggests the action "post an answer." Further, we have checked the response rate of the Q-learning-based intelligent agent on the faculty network and observed that our approach drew more responses than a cash prize, as shown in Fig. 6.

Table 2 - Sample policy
State   Optimal action
S0      Post an article (Ar)
S1      Post a task (T)
S2      Post an answer (An)
S3      Post an answer (An)
S4      Post an article (Ar)
S5      Post a task (T)
S6      Post a blog (B)
S7      Post an article (Ar)
S8      Post a question (Q)
S9      Post a blog (B)
S10     Post an answer (An)
S11     Post a task (T)
S12     Post an article (Ar)
S13     Post a task (T)
S14     Post an answer (An)
S15     Post a task (T)
S16     Post an article (Ar)
S17     Post a task (T)
S18     Post an article (Ar)
S19     Post an answer (An)

The screenshot of the reward details of one faculty member on the faculty network is shown in Fig. 7.
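The suggestion step described above, in which the system reads the recommended action for the user's current state from Table 2, can be sketched as a simple lookup. The mapping below follows the reading in which S2 corresponds to "post an answer", as in the example in the text; assigning the first table entry to S0 is an assumption, as are the names `POLICY`, `LABELS`, and `suggest_action` and the 50-point binning of integer points.

```python
# Optimal action per state index, transcribed from Table 2 (S0 is assumed).
POLICY = {
    0: "Ar", 1: "T", 2: "An", 3: "An", 4: "Ar", 5: "T", 6: "B", 7: "Ar",
    8: "Q", 9: "B", 10: "An", 11: "T", 12: "Ar", 13: "T", 14: "An",
    15: "T", 16: "Ar", 17: "T", 18: "Ar", 19: "An",
}
LABELS = {
    "Q": "post a question",
    "An": "post an answer",
    "B": "post a blog",
    "Ar": "post an article",
    "T": "post a task",
}


def suggest_action(points):
    """Return (state label, suggested action) for a user's current points."""
    # States are 50-point bins: 0-50 is S0, 51-100 is S1, ..., 951-1000 is S19.
    state = max(0, (points - 1) // 50)
    return f"S{state}", LABELS[POLICY[state]]


print(suggest_action(120))  # ('S2', 'post an answer')
```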

CONCLUSIONS AND FUTURE WORK
The proposed technique for motivating user participation in a crowdsourcing platform uses both gamification elements and the Q-learning algorithm. This is more effective than simply using gamification elements, as this framework boosts people's intrinsic motivation to engage in crowdsourcing activities.
Further, gamification with reinforcement learning can create a customized experience for users of a crowdsourcing platform that not only keeps them more engaged but also offers the opportunity for continuous learning and for changing their behavior when required. A limitation of our algorithm is that, for simplicity, we have taken the goal state of an agent to be reaching 1000 points. At this goal state, the user receives a golden reward. In the future, we would like to improve our system by exploring further rewards and privileges once faculty members have received a golden reward. The open problems with Q-learning are how to balance exploration and exploitation, increase the convergence rate, and avoid converging to a local optimum.

ACKNOWLEDGEMENT
We thank our off-campus faculty colleagues who have volunteered to participate in the social media application deployed in the faculty network, giving us an opportunity to gather real world datasets in an academic environment.