INTELLIGENT VOICE-BASED E-EDUCATION SYSTEM: A FRAMEWORK AND EVALUATION

Voice-based web e-Education is a technology-supported learning paradigm that gives learners telephone access to web-based e-Learning applications. These applications are designed mainly for the visually impaired. They are, however, lacking in adaptive and reusable learning objects, which are emerging requirements for applications in this domain. This paper presents a framework for developing intelligent voice-based applications in the context of e-Education. The framework supports intelligent components such as adaptation and recommendation services. A prototype Intelligent Voice-based E-Education System (iVEES) was developed and tested by visually impaired users. A usability study was carried out using the International Organization for Standardization (ISO) 9241-11 specification to determine the level of effectiveness, efficiency and user satisfaction. Our findings show that the application is of immense benefit, owing to the system's inherent capacity for taking autonomous decisions that adapt to users' requests.


INTRODUCTION
E-education entails the use of modern information and communication technology (ICT) in teaching and learning to enhance traditional classroom education and achieve better results in distance education, web-based learning and mobile learning [1]. Over the last decade, there has been a change of focus from web-based learning systems that merely turn pages of content to systems that present learning materials in a way that satisfies the needs of learners. This change of focus is especially important in modern learning methods, which place strong emphasis on learners' previous knowledge. The uniqueness of a learner is met by making the web-based learning content adaptive [2]. The increasing number of Learning Management Systems (LMS) for online teaching, quizzes, assignment delivery, discussion forums, email, chat, et cetera, means that dynamic online educational services will be needed for efficient management of all educational resources on the web. Selecting and organizing learning resources based on learners' interests is cumbersome [3]. The process of selection may be easier for regular users, but for certain categories of learners, notably the visually impaired, navigating a Voice User Interface (VUI) for the desired learning content is a strenuous task.
Web-deployed VUI applications for educational purposes give users access to content via telephone. One of the tools used for developing these applications is Voice eXtensible Mark-up Language (VoiceXML). VUI applications are primarily developed to cater for the visually impaired [4]. The major problem is that the services of reusable learning objects currently available to regular users are not available to the visually impaired [5]. The voice-based web applications designed mainly for the visually impaired lack sufficient support for adaptable and reusable learning objects, which is a major requirement for this category of users as a result of their physical impairment. The goal of adaptive voice-based web learning is to adjust the content of the learning objects to suit the user's knowledge level, whereas the recommendation services provide the most appropriate learning objects to users. The inability of existing e-Education voice applications to meet these requirements has far-reaching implications, such as limited accessibility for certain users, especially the visually impaired, and usability issues resulting from the lack of features for reasoning, adaptation and recommendation. A significant contribution would be to introduce the concept of reusing learners' previous experiences, which is often neglected in existing voice-based e-Education systems, to make them more adaptable and to provide recommendation services suited to users' needs. Thus, we have to take into account that the existing reference guidelines for developing these voice-based e-Education applications, although important, lack intelligent component services to approach the problem. A way of enhancing existing voice-based e-Education applications is by adapting their content to the needs of each student.
International Journal of Computing, ISSN 1727-6209, www.computingonline.net, computing@computingonline.net
It is a common belief that the performance of spoken dialogue systems can be enhanced through adaptation [6], and that adopting artificial intelligence methodologies such as case-based reasoning (CBR) to provide recommendation services for the different needs of students is essential [7]. CBR is the process of solving new problems based on the solutions of similar past problems.
From the foregoing, it is clear that there is a need for a framework that improves the level of intelligence of voice-based e-Education applications. This paper addresses that need and employs the resulting framework to develop a prototype Intelligent Voice-based E-Education System (iVEES) capable of improving learning processes using telephone and web-based technologies. The prototype application was tested in a school for the blind, and the result of the evaluation is reported. The proposed framework serves as a reference model for implementing intelligent voice-based e-Education applications. Applications developed on this framework would exhibit the necessary attributes of CBR, including adaptation and recommendation during interaction, resulting in intelligence such as the ability of the system to take autonomous decisions that adapt to learners' requests based on their requirements. The application will therefore be helpful for people with physical access difficulties engendered by their impairment.
The research methodology adopted in this study combines CBR and Porter's stemming algorithm to realize the intelligent search agent for answers to tutorial questions. CBR is used because we desire a system that can reason from experience, saving the time of teachers who attend to a large number of students. A system that not only provides a solution but also recommends a viable one is most appropriate for this research. Porter's stemming algorithm is used because its suffix-stripping process reduces the total number of terms in the information retrieval (IR) system, and hence the size and complexity of the data in the system. A smaller data size saves storage space and processing time.
The remaining part of this paper is structured as follows: Section two discusses related work. Section three presents the proposed voice-based e-Education framework with specific emphasis on its adaptation and recommendation component services. In section four, a description of an implementation and deployment architecture of the proposed system is presented. Section five reports the system evaluation and the paper is concluded in section six.

RELATED WORK
Learning content is increasingly stored digitally and can be accessed in voice response form. This has opened up new opportunities for voice applications in the domain of e-Education. A voice-enabled web-based absentee system was developed in [8] on the TellMe voice portal [9] to keep records of absentee calls from students, faculty and University staff. It was tested by a class of software engineering students. A student intending to miss a class called the VoiceXML telephone number and was led through an automated dialog to have his/her data, the date and time of the call, the courseID, and the date of intended absence recorded in a database. The system provided instructors and other administrators with a permanent record of absentees that can be accessed and displayed in various forms through a web interface. The researchers in [10] explored the integration of speech recognition technologies into m-Learning applications to reduce access barriers. An educational online forum accessible through mobile devices was built, based on an m-Learning framework proposed in [11]. Similarly, the authors of [12] presented the results of their experience in developing an experimental prototype for speech-based literacy e-Learning, together with an application architecture suitable for literacy-based e-Learning. The authors in [4] designed, implemented and deployed a voice-enabled application called the V-HELP system, in which a portion of the Computer Science and Engineering (CSE) department website is voice-enabled using VoiceXML to give access to the visually impaired student population.
These applications were developed using VoiceXML to handle record keeping of learners' data, including data input and output through voice response. However, they are incapable of taking independent decisions beyond responding to what the user has requested from the system. Content adaptation and recommendation during information retrieval (IR) were not considered as an integral part of the applications. Thus, the framework used by these applications lacks the required intelligence for reusable learning objects, and this is the motivation for the proposed framework.

THE VOICE-BASED E-EDUCATION FRAMEWORK
A user-centric design approach is adopted in order to develop a framework that is based on voice interaction. A survey was conducted to find out the requirements of an e-Education system, and in particular those of the visually impaired. From the data collated from the survey questionnaire, significant differences were found between expert, intermediate and beginner users. Expert users can do most of the typical tasks that users would normally do on the Voice User Interface (VUI), such as voice learning, participating in tutorials and checking examination results. As expected, beginner users have little or no knowledge of some particular learning content. Based on the information gathered, the following points were identified as major requirements for the framework and application: i) ability to provide users with services involving course registration, voice lecture, tutorial, examination and result; ii) ability to accommodate different types of users based on their respective learning profiles; iii) ability to differentiate the course contents into different levels such as expert, intermediate and beginner; and iv) selection of different course contents based on users' profiles. The proposed framework for iVEES is shown in Fig. 1. The framework comprises an interaction layer and an intelligent layer.

Fig. 1 - Intelligent voice-based e-Education framework
The learner's knowledge level is subdivided into three categories: beginner, intermediate and expert. A learner's profile determines whether he/she should receive the beginner, intermediate or expert content of the lecture, tutorial, examination and result modules. The system navigation process is contained in the interaction manager. Classification as beginner, intermediate or expert is done through content adaptation and recommendation using the score allocation and result evaluation services. Intelligent information retrieval (IR), involving CBR and stemming, is engaged by the recommendation service to provide recommended answers to tutorial questions using the previous experience of learners, and to expand the search for answers using the stemming algorithm. Learning objects consist of chunks of course material in text format, which allow information to be presented to users in several ways. Tutorial questions asked by previous learners and e-Education data are stored in the case knowledge base and the domain ontology respectively.
The adaptation component service uses the auto score allocation and result (R) evaluation model to determine the learner's knowledge level. The model is expected to create the most suitable course content for each learner and to control the passage from one knowledge level to another. Within each knowledge level there is an activity (quiz) containing at least one question. Before a learner moves from one knowledge level to another, the system must evaluate the learner's performance through a set of evaluations. The evaluation criteria are represented in Fig. 2. To be allowed to move from one knowledge level to another, the learner's result must satisfy the following transition criteria: 0.0 <= R < 0.1 (default), 0.1 < R < 0.4 (beginner level), 0.4 < R < 0.7 (intermediate level), and 0.7 < R <= 1.0 (expert level).
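The transition criteria can be illustrated with a minimal Python sketch. It is not the iVEES implementation. The names `Level`, `classify_level` and `quiz_result` are hypothetical, the result R is assumed to be the fraction of quiz questions answered correctly, and, because the stated ranges leave the exact boundary values 0.1, 0.4 and 0.7 unassigned, each boundary is assumed here to belong to the higher level.

```python
from enum import Enum


class Level(Enum):
    DEFAULT = "default"
    BEGINNER = "beginner"
    INTERMEDIATE = "intermediate"
    EXPERT = "expert"


def quiz_result(correct: int, total: int) -> float:
    """Assumed scoring rule: R is the fraction of questions answered correctly."""
    if total < 1:
        raise ValueError("a quiz contains at least one question")
    if not 0 <= correct <= total:
        raise ValueError("correct must lie between 0 and total")
    return correct / total


def classify_level(r: float) -> Level:
    """Map a result R in [0.0, 1.0] to a knowledge level."""
    if not 0.0 <= r <= 1.0:
        raise ValueError(f"R must lie in [0.0, 1.0], got {r}")
    # Assumption: the boundary values 0.1, 0.4 and 0.7 fall in the higher level.
    if r < 0.1:
        return Level.DEFAULT
    if r < 0.4:
        return Level.BEGINNER
    if r < 0.7:
        return Level.INTERMEDIATE
    return Level.EXPERT


if __name__ == "__main__":
    r = quiz_result(correct=3, total=4)
    print(r, classify_level(r).value)
```

Under these assumptions, a learner who answers three of four questions correctly obtains R = 0.75 and is placed at the expert level.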
The learner's experiences and situation are captured in the learner's profile. Using this experience, the system is able to offer the current learner the best-suited learning content. This experience is captured and provided using CBR. The recommendation service is designed using CBR, stemming and a domain ontology. Our proposed system uses a Biology domain ontology that represents domain-specific knowledge, i.e., the relationships between words used in the Biology subject. The text tokenizer and the Porter stemming algorithm decompose the textual information into sentences and then into individual words with their stems, for ease of retrieving their synonyms. Because of the large number of words, computing a weight for every word and enhancing terms using the ontology are required to improve retrieval effectiveness.
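The preprocessing pipeline described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the system's code. The stemmer implements only step 1a of Porter's algorithm (plural stripping) to keep the example short, where a full implementation applies several further steps. The `ONTOLOGY` dictionary is a made-up fragment standing in for the Biology domain ontology, and all function names are hypothetical.

```python
from __future__ import annotations

import re

# Hypothetical ontology fragment: stem -> related stems (illustrative only).
ONTOLOGY: dict[str, set[str]] = {
    "cell": {"cytoplasm", "nucleus"},
    "plant": {"flora"},
}


def split_sentences(text: str) -> list[str]:
    """Split text into sentences at whitespace that follows ., ! or ?."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def tokenize(sentence: str) -> list[str]:
    """Lower-case a sentence and return its alphabetic words."""
    return re.findall(r"[a-z]+", sentence.lower())


def stem_step_1a(word: str) -> str:
    """Porter step 1a only: SSES -> SS, IES -> I, SS -> SS, S -> ''."""
    if word.endswith("sses"):
        return word[:-2]
    if word.endswith("ies"):
        return word[:-2]
    if word.endswith("ss"):
        return word
    if word.endswith("s"):
        return word[:-1]
    return word


def expand_terms(stems: list[str], ontology: dict[str, set[str]] = ONTOLOGY) -> list[str]:
    """Append ontology-related terms to the stems, without duplicates."""
    expanded = list(stems)
    for stem in stems:
        for related in sorted(ontology.get(stem, ())):
            if related not in expanded:
                expanded.append(related)
    return expanded


def preprocess(text: str) -> list[list[str]]:
    """Return one list of expanded stems per sentence."""
    return [
        expand_terms([stem_step_1a(w) for w in tokenize(s)])
        for s in split_sentences(text)
    ]


if __name__ == "__main__":
    print(preprocess("Plant cells divide. Cats sleep!"))
```

Stemming before the ontology lookup means that "cells" and "cell" map to the same ontology entry, which is the reduction in vocabulary size that motivates the use of suffix stripping.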
A vector space model (VSM) is employed for performing query retrieval and assigning weights to the words in the query [13]. The two main components of the weight are term frequency (TF) and inverse document frequency (IDF), together known as TF-IDF. TF-IDF estimates the importance of a term in a given document by multiplying the term frequency (TF) of the term in a query by the term's inverse document frequency (IDF) weight. The weight of the i-th term in the j-th query is computed using the formula in equation 4 [14]:

$$W_{i,j} = f_{i,j} \times \log\left(\frac{n}{df_i}\right) \qquad (4)$$

where $f_{i,j}$ is the frequency of the i-th term in the j-th query, $df_i$ is the number of queries that contain term i, and $n$ is the total number of queries. This weighting method calculates the weight $W_{i,j}$ of each term from the stored cases and from the user's input query so that the two can be matched. To find the textual similarity between a stored case vector and a new case query vector, we apply the cosine similarity function [15].
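The weighting and matching steps can be sketched in Python. This is a minimal illustration under stated assumptions rather than the system's code. The weight is taken to be the standard TF-IDF product f × log(n / df) with a natural logarithm, the new query is counted as one more entry in the collection when n and df are computed, and the names `tfidf_vectors`, `cosine` and `best_match` are hypothetical.

```python
from __future__ import annotations

import math
from collections import Counter


def tfidf_vectors(queries: list[list[str]]) -> list[dict[str, float]]:
    """Weight each term as f * log(n / df) (assumed form, natural log)."""
    n = len(queries)
    df: Counter[str] = Counter()
    for terms in queries:
        df.update(set(terms))  # count each term once per query
    vectors = []
    for terms in queries:
        tf = Counter(terms)
        vectors.append({t: f * math.log(n / df[t]) for t, f in tf.items()})
    return vectors


def cosine(u: dict[str, float], v: dict[str, float]) -> float:
    """Cosine similarity of two sparse vectors; 0.0 if either has zero length."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def best_match(stored: list[list[str]], query: list[str]) -> tuple[int, float]:
    """Return the index and similarity of the stored case closest to the query."""
    if not stored:
        raise ValueError("no stored cases to match against")
    # Assumption: the new query is weighted together with the stored cases.
    vectors = tfidf_vectors(stored + [query])
    query_vec = vectors[-1]
    scores = [cosine(case_vec, query_vec) for case_vec in vectors[:-1]]
    index = max(range(len(scores)), key=scores.__getitem__)
    return index, scores[index]


if __name__ == "__main__":
    cases = [["cell", "divid"], ["plant", "leaf"], ["blood", "cell"]]
    print(best_match(cases, ["plant", "leaf", "cell"]))
```

In a CBR setting, the index returned by `best_match` would identify the stored tutorial question whose recorded answer is recommended to the learner. A term that occurs in every entry receives weight zero under this formula, so it does not influence the match.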

IMPLEMENTATION
A prototype application has been developed using Hypertext Preprocessor (PHP) for the Web User Interface (WUI), VoiceXML for the Voice User Interface (VUI), Apache as middleware and MySQL as the database. The components of the application, comprising design, implementation and development, have been reported in [16,17]. The WUI of the application, presented in Fig. 3, contains a screenshot of a sample tutorial question used to demonstrate the recommendation component service of the application. The WUI is developed for system administration purposes such as uploading tutorial questions and lecture notes. The Voxeo speech platform [18] was engaged as the speech server, while CBR and Porter's stemming algorithm were used to provide intelligent services. The layered architecture presented in Fig. 4 is the generic platform for the proposed framework. It shows the location of each component in the platform and consists of the presentation tier, logic tier and data tier. The database is separated from the client by the logic tier. In the presentation tier, users are able to connect to the e-Education application. The components of the clients' interface are i) Personal Computers (PC) for the WUI, and ii) land phones and mobile phones for the VUI.

Fig. 4 - Deployment architecture for the framework
The voice browser receives each call into the application and submits it to the speech server for further processing. The logic tier comprises the speech server, web server and application server. The broken square box in the application server represents the main contribution of this study. Once a user has been authenticated, the user's query is translated to text by the automated speech recognition (ASR) component and passed to the web server for execution. The text-to-speech (TTS) component does the reverse, translating text to speech. The client application interfaces with the logic tier using the ASR and TTS. The data tier provides data services and is responsible for changing, adding, or deleting information in the database within the system. The VoiceXML application was deployed on a Voxeo voice server [18] on the web and accessed through the VUI from a mobile or land phone using the format: <source country int. dial out #> <destination country code><destination area code><generated Voxeo voice network 7 digit #>. To connect to the application, dial: 009-1-312-3805870 or 009-1-412-5284985 from anywhere in the country.

SYSTEM EVALUATION
The application was evaluated for product usability to determine the level of effectiveness, efficiency and user satisfaction. Evaluation is a fundamental requirement in determining the practical usability of a product [19]. The usability of the e-Education application was measured, using the International Organization for Standardization (ISO) usability standard [20], to specify the features and attributes required to make the product usable.
The telephony application was tested using a user survey to evaluate the usability attributes. The PARAdigm for DIalogue System Evaluation (PARADISE) method proposed in [21] could also be used to evaluate the system. ISO's usability criteria were chosen because of our respondents' prior experience with mobile phone usage, and in order to compare the findings of this study with those obtained from the pilot implementation of the iVEES project [17]. The users who evaluated the application are not novices; they gained some experience of VUI applications during the pilot implementation of the project.
An overall score for all the learners was computed for each usability dimension by averaging all the ratings on the questionnaire. With the assistance of some of the teachers who are not visually impaired, the respondents were taken through a short training on how to dial, from a mobile phone, the telephone number that connects learners to the application, and on how to navigate within the application. The average (AVG) rating, standard deviation (SD) and variance (VAR), which describe the mean and dispersion of the data collected for the usability attributes, are presented in Table 1.
Several usability studies suggest that, on a 1-5 scale, a system with "Excellent Usability" would have a mean rating of 5, "Good Usability" 4, "Average Usability" 3, "Bad Usability" 2 and "Very Bad Usability" 1. It was proposed in [22] that "Good Usability" should have a mean rating of 4 on a 1-5 scale and 5.6 on a 1-7 scale. Therefore, we can conclude that the prototype application developed for the school has "Good Usability", based on the average (AVG) total rating of 4.16.
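The summary statistics and the rating interpretation can be reproduced with a short Python sketch. The ratings below are invented purely for illustration and are not the study's data. The sample (n - 1) forms of SD and VAR are assumed, since the variant used in the study is not stated. The function names are hypothetical, and a mean rating is assumed to earn the label of the highest anchor value it reaches.

```python
from __future__ import annotations

import statistics


def summarize(ratings: list[float]) -> dict[str, float]:
    """Return AVG, SD and VAR of a list of ratings (sample statistics assumed)."""
    return {
        "AVG": statistics.mean(ratings),
        "SD": statistics.stdev(ratings),
        "VAR": statistics.variance(ratings),
    }


def usability_label(mean_rating: float) -> str:
    """Label a mean rating on the 1-5 scale by the highest anchor it reaches."""
    if not 1.0 <= mean_rating <= 5.0:
        raise ValueError("mean rating must lie on the 1-5 scale")
    anchors = [
        (5.0, "Excellent Usability"),
        (4.0, "Good Usability"),
        (3.0, "Average Usability"),
        (2.0, "Bad Usability"),
        (1.0, "Very Bad Usability"),
    ]
    for threshold, label in anchors:
        if mean_rating >= threshold:
            return label
    raise AssertionError("unreachable for ratings in [1, 5]")


if __name__ == "__main__":
    made_up_ratings = [4, 5, 4, 3, 5]  # illustrative values only
    print(summarize(made_up_ratings))
    print(usability_label(4.16))
```

Applied to the reported average total rating of 4.16, this rule returns "Good Usability", matching the conclusion drawn above.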

CONCLUSION
In this paper, a framework for an intelligent voice-based e-Education system has been provided. The framework was used as a generic guideline to develop a prototype intelligent voice-based e-Education application for the iVEES project. The prototype application was tested in a school for the blind, and the result of the evaluation was presented. The findings show that the users are enthusiastic about using voice-based telephone learning as another form of assistive technology to complement the conventional learning methods for the visually impaired.
The framework would serve as a reference model for implementing telephone-based e-Education applications for normal and visually impaired learners. The application will also assist people with physical access difficulties (e.g. repetitive strain injury, arthritis, high spinal injury) that make writing difficult. It can also be effective for students with reading or spelling difficulties (e.g. dyslexia).
The future research direction for this work is two-fold: first, evaluation of the system using PARADISE; and second, inclusion of voice biometric features as a security mechanism to authenticate candidates, particularly for examinations.