An Effective Algorithm for Mining and Grouping Online Transactions in Online Systems
Narasimham Parimi 1, Mirza Mohsin Raza 2, Prof. S. V. Achutha Rao 3
1 Pursuing M.Tech (CSE), Vikas College of Engineering and Technology, Nunna, Vijayawada, JNTU-K, India.
2 Associate Professor, Department of CSE, Vikas College of Engineering and Technology, Nunna, Vijayawada, India.
3 Professor and Head, Department of CSE, Vikas College of Engineering and Technology, Nunna, Vijayawada, India.
Abstract: Online transaction data between online visitors and online functionalities usually convey users' task-oriented behavior models. Grouping online transactions captures this knowledge, which in turn supports the creation of user profiles that may be associated with different navigational models. For future online applications, such as online recommendation or online personalization, this knowledge is most important in helping online users get their preferred information accurately. We demonstrate the usability and scalability of the proposed approach through experiments on two real-world data sets. The experimental results show the method's effectiveness in comparison with some previous studies.
I-INTRODUCTION
With the growing popularity and spread of online applications, the online world has become a strong platform not only for retrieving data but also for discovering knowledge from online data stores. Generally, online users show different types of behavior, associated with their information needs and intended tasks, when they are traversing online sites. These task-oriented behaviors are explicitly characterized by sequences of clicks on different online items made by customers. As a result, those tasks can be captured by inducing the underlying relationships among the click-stream data. For example, imagine an online site designed to provide information about automobiles; such an e-commerce site will be visited by a variety of customer groups with various access interests. One type of customer intends to make comparisons before purchasing: a customer willing to purchase a specific type of car, a wagon for example, would browse the online pages of each company and compare their offers. Another type of customer is more interested in one specific brand, such as Ford, than in one specific car category.
In online data mining research, many data mining techniques, such as clustering, are widely adopted to improve the usability and scalability of online mining. Access transactions over an online site can be expressed with two finite sets: user transactions and hyperlinks/URLs. Let U = {t1, t2, . . . , tm} be the set of transactions formed by m users and A = {hl1, hl2, . . . , hln} be the set of n distinct clicks (hyperlinks/URLs) made by users, where every ti ∈ U is a non-empty subset of A. The temporal order of users' clicks within transactions has been taken into account. A user transaction t ∈ U is represented as a vector.
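As a rough illustration of the vector representation mentioned above, the sketch below encodes each transaction as a binary vector over the hyperlink set A. The binary encoding is an assumption made here for simplicity (the text does not specify the weighting, and a binary vector does not preserve click order), and the transaction data and the name `build_transaction_vectors` are hypothetical.

```python
from typing import Dict, List


def build_transaction_vectors(
    transactions: Dict[str, List[str]], hyperlinks: List[str]
) -> Dict[str, List[int]]:
    """Encode each transaction as a 0/1 vector over the hyperlink set A.

    Assumption: a component is 1 if the hyperlink was clicked at least once
    in the transaction and 0 otherwise.
    """
    index = {hl: j for j, hl in enumerate(hyperlinks)}
    vectors = {}
    for tid, clicks in transactions.items():
        vec = [0] * len(hyperlinks)
        for hl in clicks:
            vec[index[hl]] = 1
        vectors[tid] = vec
    return vectors


# Hypothetical data: |U| = 4 transactions over |A| = 5 hyperlinks.
hyperlinks = ["hl1", "hl2", "hl3", "hl4", "hl5"]
transactions = {
    "t1": ["hl1", "hl2"],
    "t2": ["hl2", "hl3", "hl4"],
    "t3": ["hl1", "hl5"],
    "t4": ["hl2", "hl3"],
}
vectors = build_transaction_vectors(transactions, hyperlinks)
for tid, vec in vectors.items():
    print(tid, vec)
```

A weighted encoding (for example, click counts or viewing time) could replace the 0/1 values without changing the structure of the function.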
A well-known approach for clustering online transactions uses rough set theory. De and Krishna proposed an algorithm for clustering online transactions using rough approximation. It is based on the similarity of the upper approximations of transactions for a given threshold. However, several iterations are needed to merge two or more clusters that have the same similarity of upper approximations, and the authors did not present how to handle the case in which more than one transaction falls under the given threshold. To avoid these problems, we propose another technique for clustering online transactions. We use the concept of similarity class proposed by [11], but the proposed technique differs in how it allocates transactions to the same cluster and in how it handles the case in which more than one transaction falls under the given threshold.
Generally, online mining techniques can be defined as methods that extract so-called nuggets (or knowledge) from an online data repository, such as content, linkage, and usage information, by utilizing data mining tools. Among such online data, the user click-stream, i.e. usage data, can be utilized to capture users' navigational patterns and identify users' intended tasks. Once customers' navigational behaviors are effectively characterized, they benefit further online applications and, in turn, facilitate and improve online service quality for both online-based organizations and end users. As a result, online usage mining has recently become an active and popular topic, and a variety of research communities from database management (DBMS), artificial intelligence (AI), information systems (IS), etc., have addressed this topic and achieved great success as well [1-7]. Meanwhile, with the benefits of great progress in data mining research, many data mining techniques, such as clustering, association rule mining, and sequential pattern mining, are widely adopted to improve the usability and scalability of online mining.
International Journal of Computer Trends and Technology (IJCTT) volume 4 Issue 9 Sep 2013 ISSN: 2231-2803 http://www.ijcttjournal.org Page 2997
While online shoppers are generally well satisfied, there is room to increase their satisfaction with delivery and returns. Free shipping is a strong motivator: it brings customers back to store sites to make further purchases and leads shoppers to recommend an online retailer. Purchasers are nevertheless willing to pay a small fee to get their product faster. When comparison shopping, purchasers take product price and shipping charges almost equally into consideration. There are several other things that retailers can do to improve the experience for their online customers. The first is to show the expected delivery date of the order; customers are willing to wait for their orders but want to know just how long that will take. Timely receipt of products encourages shoppers to recommend an online seller. Purchasers also like to receive updates and delivery notifications so that they know when their package is arriving.
II-RELATED WORK
In this paper, comparisons between the proposed technique and the technique proposed by [11] are presented through two examples, in which two small data sets of transactions are considered. The first transaction data set, given in Table 1, is an adopted example containing four objects (|U| = 4) with five hyperlinks (|A| = 5). The technique is implemented in three main steps. The first step obtains a measure of similarity that gives information about the users' access patterns related to their common areas of interest, by a similarity relation between two
In this paper, we address these issues by proposing an alternative approach for clustering online transactions and generating user profiles. After data preprocessing, we produce a user transaction collection and a page view corpus via user identification and page view identification, respectively, and in turn construct the session-page view (SP) matrix as usage data, in which each cell holds a weight representing the contribution made by a specific page view during one user transaction. In this manner, we map the relationships among the co-occurrence observations (i.e. user transactions) into a high-dimensional space. Moreover, an improved LSA-based clustering algorithm, named latent usage information (LUI), is proposed to find user segments with similar behaviors effectively and precisely from the aforementioned usage data. It relies on linear algebra, in particular the singular value decomposition of a matrix, to reveal deeper relationships among online transactions. The discovered user clusters are exploited to generate a variety of goal-oriented user profiles by calculating the centroid of each cluster in the form of a weighted page view set. Experiments are conducted on two real-world datasets to validate the usability and scalability of usage mining. Meanwhile, an evaluation metric is adopted to assess the quality of the discovered clusters, and comparisons are made with some previous work as well. The experimental results show that the proposed approach is capable of effectively discovering user access patterns and revealing the underlying relationships among user visiting records.
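The projection of the session-page view matrix into a latent space can be sketched with a truncated singular value decomposition. This is a minimal illustration under stated assumptions, not the exact LUI procedure: it assumes NumPy is available, the weights in `sp` are made up, and the function name `latent_usage_projection` and the choice of scaling the left singular vectors by the singular values are choices made here.

```python
import numpy as np


def latent_usage_projection(sp: np.ndarray, k: int) -> np.ndarray:
    """Project each session (row of sp) into a k-dimensional latent space.

    Uses the truncated SVD sp ~= U_k * diag(s_k) * Vt_k and returns
    U_k * diag(s_k), so dot products between projected sessions approximate
    dot products between the original session rows.
    """
    u, s, _vt = np.linalg.svd(sp, full_matrices=False)
    k = min(k, len(s))
    return u[:, :k] * s[:k]


# Hypothetical SP matrix: 4 sessions (rows) x 5 page views (columns).
sp = np.array(
    [
        [1.0, 0.8, 0.0, 0.0, 0.1],
        [0.9, 1.0, 0.1, 0.0, 0.0],
        [0.0, 0.1, 1.0, 0.7, 0.0],
        [0.0, 0.0, 0.8, 1.0, 0.2],
    ]
)
latent = latent_usage_projection(sp, k=2)
print(latent.shape)
```

Keeping only the k largest singular values discards low-energy directions, which is what allows sessions with similar but not identical page views to end up close together in the latent space.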
III-CLUSTERING ONLINE TRANSACTIONS
Here we adopt a modified version of the standard K-means clustering algorithm, named MK-means clustering, to cluster user sessions based on the transformed SP matrix over the latent k-dimensional space. This algorithm does not need a predefined value of k or k initial centroids, whereas the standard k-means algorithm requires both to start clustering. The algorithm is described as follows:
Algorithm: MK-means clustering
Input: usage data SP and similarity threshold $\varepsilon$
1. Choose the first user session $s_1$ as the initial cluster $C_1$ and as the centroid of this cluster, i.e. $C_1 = \{s_1\}$ and $Cid_1 = s_1$.
2. For each session $s_i$, calculate the similarity between $s_i$ and the centroid of each existing cluster, $sim(s_i, Cid_j)$.
3. If $sim(s_i, Cid_k) = \max_j \big( sim(s_i, Cid_j) \big) > \varepsilon$, then allocate $s_i$ into $C_k$ and re-calculate the centroid of cluster $C_k$ as $Cid_k = \frac{1}{|C_k|} \sum_{j \in C_k} s_j$;
4. Otherwise, let $s_i$ itself construct a new cluster and be the centroid of this cluster.
5. Repeat steps 2 to 4 until all user sessions are processed.
Output: cluster set $CS = \{C_k\}$
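The steps above can be sketched in a few lines of Python. This is an illustrative single-pass version under stated assumptions: the similarity measure is taken to be cosine similarity (the listing does not name one), sessions are plain lists of floats, and the names `cosine` and `mk_means` as well as the sample data are hypothetical.

```python
import math
from typing import List, Tuple


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    if na == 0.0 or nb == 0.0:
        return 0.0
    return dot / (na * nb)


def mk_means(
    sessions: List[List[float]], threshold: float
) -> Tuple[List[List[int]], List[List[float]]]:
    """Threshold-driven clustering; no k or initial centroids are required."""
    clusters = [[0]]                    # step 1: first session seeds C1
    centroids = [list(sessions[0])]
    for i in range(1, len(sessions)):   # step 5: visit every remaining session
        s = sessions[i]
        sims = [cosine(s, c) for c in centroids]          # step 2
        k = max(range(len(sims)), key=sims.__getitem__)   # most similar cluster
        if sims[k] > threshold:                           # step 3
            clusters[k].append(i)
            members = [sessions[j] for j in clusters[k]]
            centroids[k] = [sum(col) / len(members) for col in zip(*members)]
        else:                                             # step 4
            clusters.append([i])
            centroids.append(list(s))
    return clusters, centroids


# Hypothetical sessions in a 2-dimensional latent space.
sessions = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
clusters, centroids = mk_means(sessions, threshold=0.8)
print(clusters)
```

Because the number of clusters is driven by the threshold, a higher threshold tends to produce more, tighter clusters, and the result depends on the order in which sessions are visited.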
Shopping Experience and Satisfaction
Consumer satisfaction with online shopping overall is high, at 86%. Online shoppers are most satisfied with ease of check-out (83%), variety of brands/products (82%), and online tracking ability (79%). Online shoppers are least satisfied with the flexibility of shipping, including the flexibility to select a delivery date (58%) and to re-route packages (57%), and with the ease of making returns and exchanges (65%). In addition to easing returns and exchanges, there is an opportunity to increase purchaser satisfaction by having a clear returns policy. Logistics services may directly affect 6 out of 11 of the aspects that influence a customer's shopping experience. For retailers looking to increase customer satisfaction, it is important to look not only at how satisfied users are with various aspects of the online shopping experience, but also at how important these factors are. To do this, a quadrant analysis was performed, mapping the derived importance of each factor against its satisfaction percentage. Items in the upper-right quadrant are those with both high importance and high satisfaction. Because of their high priority, it is important for retailers to maintain high levels of satisfaction on these elements: ease of check-out, the variety of brands and products offered, and the ability to create an account to store purchase history and personal information. The factors in the bottom half of the chart are of lower importance in driving overall online shopping satisfaction. While frequently cited by consumers as a must-have, free or discounted shipping is actually less important in driving overall satisfaction than the factors mentioned above, specifically ease of check-out and the variety of brands and products offered. The upper-left quadrant of the chart contains the factors that are highly important in driving satisfaction but currently have low satisfaction. These factors, a clear and easy-to-understand returns policy and the ease of making returns and exchanges, should be areas of focus for retailers looking to increase their overall customer satisfaction.
Comparison Shopping
While it is important to look at what motivates customers to return to a retailer, it is also important to look at what factors are taken into consideration when current or prospective shoppers are comparison shopping. When comparison shopping, consumers take product price and shipping charges almost equally into consideration. As a result, a shopper may choose to buy from a retailer who does not offer free or discounted shipping if the total price including shipping is less than that of a retailer offering free or discounted shipping. Item price and delivery rates were rated as the most important factors in comparison shopping. Shipping speed, purchaser reviews, retailer brand, and delivery time flexibility are all taken into account by consumers when comparison shopping, but to a lesser degree than item price and shipping charges.
Retailer Recommendation
In addition to retaining satisfied customers and attracting those who are comparison shopping, another way retailers can increase their business is through the recommendations of current customers. When shoppers were asked what would lead or has led them to recommend a retailer, the availability of free or discounted shipping was the main factor. On-time arrival of products and free or easy returns rate as the next most important factors that prompt shoppers to recommend an online retailer. Since 41% of shoppers said that receiving their product when expected led them to recommend a retailer, both communicating the delivery time and delivering reliably are critical aspects of a positive customer experience.
The current study focuses on three determinants that could influence the impact of computer-mediated recommendations on consumers' online product choices: the nature of the product recommended, the nature of the online site on which the recommendation is proposed, and the type of recommendation source. Prior research has shown that the type of product affects consumers' use of personal information sources and their influence on consumers' choices, and suggests that goods can be classified as possessing either search or experience qualities. Search qualities are those that the consumer can determine by inspection prior to purchase, and experience qualities are those that cannot be determined prior to purchase. Because it is complicated, or may be impossible, to evaluate experience products before purchase, consumers should rely more on product recommendations for these products than for search products.
In support of this view, it was found that consumers assessing a search product (e.g., a 35-mm camera) are more likely to use own-based decision-making processes than consumers assessing an experience product, and that consumers evaluating an experience product (e.g., a film-processing service) rely more on other-based and hybrid decision-making processes than consumers assessing a search product. The nature of the online site can also influence the impact of a given recommendation. Previous online site classifications suggest that recommendation sources can be used and promoted by three different types of online sites: sellers (e.g., retailer or manufacturer online sites such as Amazon.com), commercially linked third parties (e.g., comparison shopping online sites such as MyShine.com), and non-commercially linked third parties (e.g., product or merchant assessment online sites such as Consumerreports.org). More independent online sites, such as non-commercially linked third parties that facilitate consumers' external search effort by decreasing search costs, are assumed to be preferred by consumers (Alba et al., 1997; Bakos, 1997; Lynch & Ariely, 2000). By providing more alternatives to choose from and more objective information, independent online sites should be perceived as more useful by consumers. In addition, prior research on attribution theory suggests that consumers discredit recommendations from endorsers if they suspect that the latter have incentives to recommend a product (for reviews, refer to Folkes, 1988; Mizerski, Golden, & Kernan, 1979).
According to the discounting principle of attribution theory (Kelley, 1973), which suggests that a communicator will be perceived as biased if the recipient can infer that the message can be attributed to personal or situational causes, consumers would attribute more non-product-related motivations (e.g., commissions on sales) to recommendation sources that are promoted by commercially linked third parties and sellers than to those promoted by independent third-party online sites. Consequently, consumers would follow product recommendations in a greater proportion when shopping on more independent than on less independent online sites. In light of research on consumers' use of relevant others in their pre-purchase external search efforts (Olshavsky & Granbois, 1979; Price & Feick, 1984; Rosen & Olshavsky, 1987) and in consideration of the emergence of online information sources providing personalized recommendations (Ansari et al., 2000), Senecal and Nantel (2002) assert that online recommendation sources can be sorted into three broad categories: (1) other consumers (e.g., relatives), (2) human experts (e.g., salespersons, independent experts), and (3) expert systems such as recommender systems. We posit that these online recommendation sources will have different levels of influence on consumers' online product selection. It has been suggested that information received from sources that have some personal knowledge about the consumer has more influence on the consumer than information from sources that have no such knowledge. Thus, a recommendation source providing personalized information to consumers (e.g., a recommender system) should be more influential than a recommendation source providing non-personalized information (e.g., other consumers).
Results of generated user profiles
We utilize the aforementioned LUI method to cluster user transactions. For comparison purposes, we also run the PACT approach, based on the standard K-means used in [9], to generate user profiles. The results show that the generated profiles overlap in their page views, since some page views are listed in more than one user cluster. Table 1 depicts 2 user profiles generated from the KDD dataset using the LUI approach. Each user profile is listed as an ordered sequence of page views with weights, where the greater the weight of a page view, the more likely it is to be visited. The first profile in Table 1 represents the activities involved in an online shopping scenario, such as login, shopping_cart, and checkout, occurring especially when purchasing leg-wear products, whereas the second user profile reflects customers' interest in the department store itself. Analogously, some informative findings can be obtained from Table 2, which is derived from the CTI dataset. In this table, three profiles are generated: the first reflects the main concerns of international students regarding applying for admission; the second involves the online application process for graduation; and the final one indicates the most common activities that take place while students browse the university online site, especially while they are deciding on course selection, i.e. selecting a course, searching the syllabus list, and then going through a specific syllabus.
Pageview #   Pageview content                   Weight
29           Main-shopping_cart                 1.00
4            Products-product Detailleagwear    0.86
27           Main-Login2                        0.67
8            Main-home                          0.53
44           Check-expressCheckout              0.38
65           Main-welcome                       0.33
32           Main-registration                  0.32
45           Checkout-confirm_order             0.26
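The way a profile such as the one listed above could be derived from a cluster can be sketched as follows. This is an illustration under assumptions that are not stated in the text: the profile weight of a page view is taken to be the cluster-mean weight divided by the largest mean (so the top page view gets weight 1.00), page views at or below a minimum weight are dropped, and the session weights and the name `build_profile` are hypothetical.

```python
from typing import List, Tuple


def build_profile(
    cluster_sessions: List[List[float]],
    pageviews: List[str],
    min_weight: float = 0.0,
) -> List[Tuple[str, float]]:
    """Return (page view, weight) pairs sorted by descending weight.

    The weight is the centroid (column mean) of the cluster, normalized by
    its largest component. Both choices are assumptions of this sketch.
    """
    n = len(cluster_sessions)
    mean = [sum(col) / n for col in zip(*cluster_sessions)]
    top = max(mean)
    if top <= 0.0:
        return []
    pairs = [(pv, w / top) for pv, w in zip(pageviews, mean) if w / top > min_weight]
    return sorted(pairs, key=lambda p: p[1], reverse=True)


# Hypothetical cluster of two sessions over three page views.
pageviews = ["Main-shopping_cart", "Main-Login2", "Main-home"]
cluster_sessions = [
    [1.0, 0.5, 0.0],
    [1.0, 0.5, 0.5],
]
profile = build_profile(cluster_sessions, pageviews)
for name, weight in profile:
    print(f"{name:20s} {weight:.2f}")
```

Raising `min_weight` would trim rarely visited page views from the profile, at the cost of losing low-weight but possibly informative pages.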
Delivery Timing
As seen above, 60% of online shoppers say that an estimated or guaranteed delivery date is important at check-out. Because online shoppers vary in how long they are willing to wait for the delivery of their orders, retailers that offer a range of delivery time options can appeal to a wider range of customers. While 48% of customers stated that they are not willing to wait more than 5 days for most of their purchases, 23% said that they would be willing to wait 8 days or more. Just over 40% of online shoppers indicated that they have abandoned their shopping cart because of an issue with the estimated delivery time. Of the online shoppers who abandoned their cart because of the estimated delivery date, one-fourth indicated that no estimated delivery date was provided. Among those who were shown an estimated delivery date and abandoned their cart, 64% of the time the estimated delivery time was 5 days or more. Providing an estimated delivery date is an easy win for retailers who are not currently doing so.
IV-CONCLUSION
We mapped the relationships among the co-occurrence observations (i.e. user transactions) into a high-dimensional space to construct the usage data in the form of a session-page view matrix. Then a dimension-reducing algorithm (i.e. singular value decomposition) was applied to the usage matrix to capture the latent usage information for partitioning user transactions. Based on the decomposed latent usage information, we proposed a modified k-means clustering algorithm to generate user session clusters. Moreover, the discovered user groups are utilized to construct user profiles expressed in the form of a weighted page view collection, which represents the common usage pattern associated with the access interests of one specific kind of visitor. The constructed user profiles, corresponding to various task-oriented behaviors, are represented as a collection of page view-weight pairs, in which each weight represents the significance contributed by the page. Experiments are conducted on two real-world datasets to validate the usability and scalability of usage mining. Meanwhile, an evaluation metric is adopted to assess the quality of the discovered clusters, and comparisons are made with some previous works as well. The experimental results show that the proposed approach is capable of effectively discovering user access patterns and revealing the underlying relationships among user visiting records. Future work will focus on research issues such as performing experiments over more datasets, broadening the comparison, and making use of the discovered user profiles for further online applications, for example, online recommendation and personalization.
REFERENCES
Eytan Adar. User 4xxxxx9: Anonymizing query logs.
Roberto Baeza-Yates. Online usage mining in search engines. OnlineMining: Applications and Techniques, 2004.
B. Barak, K. Chaudhuri, C. Dwork, S. Kale, F. McSherry, and K. Talwar. Privacy and accuracy and consistency too: A holistic solution to contingency table.
Michael Barbaro and Tom Zeller. A face is exposed for searcher. New York Times, http://www.nytimes.com/2006/08/09/technology/09aol.html?ex=1312776000en=f6f61949c6da4d38ei=5090, 2006.
Avrim Blum, Katrina Ligett, and Aaron Roth. A learning theoretical approach to non interactive database privacy. In STOC, 2008.
Justin Brickell and Vitaly Shmatikov. The cost of privacy: destruction of data mining utility in anonymized data publishing. In KDD, 2008.
AUTHORS PROFILE
Parimi Narasimham is pursuing M.Tech (CSE) at Vikas College of Engineering and Technology (VCET), Nunna, Vijayawada, JNTU-K, India.
Mirza Mohsin Raza is working as an Asst. Professor in the CSE department at Vikas College of Engineering and Technology (VCET), Nunna, Vijayawada (Dist), JNTU-K, A.P, India.
Prof. S. V. Achutha Rao is working as HOD of CSE at Vikas College of Engineering and Technology (VCET), Nunna, Vijayawada, JNTU-K, India.