There is a temptation in some companies, due to departmental inertia and compartmentalization, to approach data mining haphazardly, to reinvent the wheel and duplicate effort. A cross-industry standard was clearly required, one that is industry-neutral, tool-neutral, and application-neutral. The Cross-Industry Standard Process for Data Mining (CRISP–DM) was developed in 1996 by analysts representing DaimlerChrysler, SPSS, and NCR. CRISP provides a nonproprietary and freely available standard process for fitting data mining into the general problem-solving strategy of a business or research unit.
According to CRISP–DM, a given data mining project has a life cycle consisting of six phases, as illustrated in the Figure. Note that the phase sequence is adaptive. That is, the next phase in the sequence often depends on the outcomes associated with the preceding phase. The most significant dependencies between phases are indicated by the arrows. For example, suppose that we are in the modeling phase.
Depending on the behavior and characteristics of the model, we may have to return to the data preparation phase for further refinement before moving forward to the model evaluation phase. The iterative nature of CRISP is symbolized by the outer circle in the Figure. Often, the solution to a particular business or research problem leads to further questions of interest, which may then be attacked using the same general process as before.
Lessons learned from past projects should always be brought to bear as input into new projects. Although issues encountered during the evaluation phase can conceivably send the analyst back to any of the previous phases for amelioration, for simplicity the Figure shows only the most common loop, back to the modeling phase. An outline of each phase follows.
CRISP–DM: The Six Phases
1. Business understanding phase. The first phase in the CRISP–DM standard process may also be termed the research understanding phase.
a. Enunciate the project objectives and requirements clearly in terms of the business or research unit as a whole.
b. Translate these goals and restrictions into the formulation of a data mining problem definition.
c. Prepare a preliminary strategy for achieving these objectives.
2. Data understanding phase
a. Collect the data.
b. Use exploratory data analysis to familiarize yourself with the data and discover initial insights.
c. Evaluate the quality of the data.
d. If desired, select interesting subsets that may contain actionable patterns.
3. Data preparation phase
a. Prepare from the initial raw data the final data set that is to be used for all subsequent phases. This phase is very labor intensive.
b. Select the cases and variables you want to analyze and that are appropriate for your analysis.
c. Perform transformations on certain variables, if needed.
d. Clean the raw data so that it is ready for the modeling tools.
4. Modeling phase
a. Select and apply appropriate modeling techniques.
b. Calibrate model settings to optimize results.
c. Remember that often, several different techniques may be used for the same data mining problem.
d. If necessary, loop back to the data preparation phase to bring the form of the data into line with the specific requirements of a particular data mining technique.
5. Evaluation phase
a. Evaluate the one or more models delivered in the modeling phase for quality and effectiveness before deploying them for use in the field.
b. Determine whether the model in fact achieves the objectives set for it in the first phase.
c. Establish whether some important facet of the business or research problem has not been accounted for sufficiently.
d. Come to a decision regarding use of the data mining results.
6. Deployment phase
a. Make use of the models created: Model creation does not signify the completion of a project.
b. Example of a simple deployment: Generate a report.
c. Example of a more complex deployment: Implement a parallel data mining process in another department.
d. For businesses, the customer often carries out the deployment based on your model.
You can find out much more information about the CRISP–DM standard process at www.crisp-dm.org.
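The adaptive phase sequence described above can be sketched as a small transition table. This is only an illustration: the `outcome_ok` flag and the `next_phase` function are hypothetical names, and the only loops encoded are the two the text mentions (modeling back to data preparation, evaluation back to modeling).

```python
# Hypothetical transition table for the six CRISP-DM phases.
PHASES = [
    "business understanding",
    "data understanding",
    "data preparation",
    "modeling",
    "evaluation",
    "deployment",
]

# Backward loops named in the text; other phases simply move forward here.
LOOP_BACK = {
    "modeling": "data preparation",
    "evaluation": "modeling",
}


def next_phase(current: str, outcome_ok: bool) -> str:
    """Return the phase to enter after `current`.

    `outcome_ok` is an illustrative flag: True means the results of the
    phase were satisfactory, False means the analyst has to loop back.
    """
    if not outcome_ok and current in LOOP_BACK:
        return LOOP_BACK[current]
    index = PHASES.index(current)
    # After deployment, new questions restart the cycle (the outer circle).
    return PHASES[(index + 1) % len(PHASES)]
```

For example, `next_phase("modeling", False)` sends the analyst back to data preparation, while `next_phase("deployment", True)` starts a new cycle at business understanding.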
This article describes a series of issues that should be considered at the start of any data analysis or data mining project. It is important to define the problem in sufficient detail, in terms of both how the questions are to be answered and how the solutions will be delivered. On the basis of this information, a cross-disciplinary team should be put together to implement these objectives. A plan should outline the objectives and deliverables along with a timeline and budget to accomplish the project. This budget can form the basis for a cost/benefit analysis, linking the total cost of the project to potential savings or increased revenues.
- Objectives: Articulating the overriding business or scientific objective of the data mining project is an important first step. Based on this objective, it is also important to specify the success criteria to be measured upon delivery. The project should be divided into a series of goals that can be achieved using available data or data acquired from other sources. These objectives and goals should be understood by everyone working on the project or having an interest in the project’s results.
- Deliverables: Specifying exactly what is going to be delivered sets the correct expectation for the project. Examples of deliverables include a report outlining the results of the analysis or a predictive model (a mathematical model that estimates critical data) integrated within an operational system. Deliverables also identify who will use the results of the analysis and how they will be delivered. Consider criteria such as the accuracy of the predictive model, the time required to compute, or whether the predictions must be explained.
- Roles and Responsibilities: Most data mining projects involve a cross-disciplinary team that includes (1) experts in data analysis and data mining, (2) experts in the subject matter, (3) information technology professionals, and (4) representatives from the community who will make use of the analysis. Including interested parties will help overcome any potential difficulties associated with user acceptance or deployment.
- Project Plan: An assessment should be made of the current situation, including the source and quality of the data, any other assumptions relating to the data (such as licensing restrictions or a need to protect the confidentiality of the data), any constraints connected to the project (such as software, hardware, or budget limitations), or any other issues that may be important to the final deliverables. A timetable of events should be implemented, including the different stages of the project, along with deliverables at each stage. The plan should allot time for cross-team education and progress reviews. Contingencies should be built into the plan in case unexpected events arise. The timetable can be used to generate a budget for the project. This budget, in conjunction with any anticipated financial benefits, can form the basis for a cost/benefit analysis.
Summary
Steps and details:
1. Define Objectives
- Define the business objectives
- Define specific and measurable success criteria
- Broadly describe the problem
- Divide the problem into sub-problems that are unambiguous and that can be solved using the available data
- Define the target population
- If the available data does not reflect a sample of the target population, generate a plan to acquire additional data
2. Define Deliverables
- Define the deliverables, e.g., a report, new software, business processes, etc.
- Understand any accuracy requirements
- Define any time-to-compute issues
- Define any window-of-opportunity considerations
- Detail if and how explanations should be presented
- Understand any deployment issues
3. Define Roles and Responsibilities
- Project leader
- Subject matter expert/business analyst
- Data analysis/data mining expert
- IT expert
- Consumer
4. Assess Current Situation
- Define data sources and locations
- List assumptions about the data
- Understand project constraints (e.g., hardware, software, personnel, etc.)
- Assess any legal, privacy or other issues relating to the presentation of the results
5. Define Timetable
- Set aside time for education upfront
- Estimate time for the data preparation, implementation, and deployment steps
- Set aside time for reviews
- Understand risks in the timeline and develop contingency plans
6. Analyze Cost/Benefit
- List the benefits to the business of a successful project
- Compare costs and benefits
Reference: “Making Sense of Data: A Practical Guide to Exploratory Data Analysis and Data Mining” by Glenn J. Myatt, Wiley-Interscience, chapter 2
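The step from timetable to budget to cost/benefit analysis described in this article can be sketched with a few lines of arithmetic. Everything concrete here is an assumption made for illustration: the `Stage` class, the stage names, the day rate, the contingency share, and the expected benefit are not taken from the book.

```python
from dataclasses import dataclass


@dataclass
class Stage:
    name: str
    person_days: float


def project_cost(stages, day_rate, contingency=0.0):
    """Budget derived from the timetable: effort times rate, plus a contingency share."""
    base = sum(stage.person_days for stage in stages) * day_rate
    return base * (1.0 + contingency)


def benefit_cost_ratio(expected_benefit, cost):
    """A ratio above 1.0 means the anticipated benefits exceed the project cost."""
    return expected_benefit / cost


# Illustrative numbers only.
stages = [
    Stage("education", 5),
    Stage("data preparation", 30),
    Stage("implementation", 20),
    Stage("deployment", 10),
    Stage("reviews", 5),
]
cost = project_cost(stages, day_rate=1000.0, contingency=0.25)
ratio = benefit_cost_ratio(expected_benefit=175000.0, cost=cost)
```

With these made-up figures the 70 person-days cost 87,500 including contingency, and the benefit/cost ratio is 2.0.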
It’s fine to say that a modeler builds a model, but what actually is a model? A model, in a general sense, is a replica of some other object that duplicates selected features of that larger object, but in a more convenient form. A plastic model World War II battleship, for instance, models the external appearance of the original to some reduced scale and is far more convenient for displaying in a living room than the original! Small-scale aircraft, made from a material that is much too heavy to allow them to fly, are useful for studying airflow around the aircraft in a wind tunnel. In hydrographic research, model ships sail model seas, through model waves propelled by model winds.
In some way, all models replicate some useful features of the original so that those features themselves, singled out from all other features, can be studied and manipulated. The nonphysical models that the data miner deals with are still models in the sense that they reflect useful features of the original objects in some more-convenient-to-manipulate way. These are symbolic models in which the various features are represented by symbols. Each symbol has a specific set of rules that indicate how the symbol can be manipulated with reference to other symbols. The symbolic manipulators used today are usually digital computers. The symbols consist of a mixture of mathematical and procedural structures that describe the relationships between, and the operations permitted on, the symbols that comprise the model itself.
Typically, data miners (and engineers, mathematicians, economists, and statisticians, too) construct models by using symbols and rules for manipulating those symbols. These models are active creations in the sense that they can be computationally manipulated to answer questions posed about the model’s behavior—and thus, by extension, about the behavior of the real world.
But data miners and statisticians differ from economists, engineers, and scientists in the way that they construct their models. And indeed, statisticians and data miners differ from each other, too. Engineers, scientists, and economists tend to form theories about the behavior of objects in the world, and then use the language of symbols to express their appreciation of the interrelationships that they propose. Manipulating the model of the proposed behavior allows them to determine how well (or badly) the proposed explanation “works.” So far as modeling goes, data miners and statisticians tend to start with fewer preconceived notions of how the world works, but to start instead with data and ask what phenomenon or phenomena the data might describe. However, even then, statisticians and data miners still have different philosophical approaches to modeling from data.
Reference: “Data Preparation for Data Mining” by Dorian Pyle, Morgan Kaufmann Publishers, chapter 12.1.2
Modeling of any data set is based on five key assumptions. They are worth reviewing because if any of them does not hold, no model will reflect the real world, except by luck! The key assumptions are:
1. Measurements of features of the world represent something real about the world.
2. Some persistent relationship exists between the features measured and the measurements taken.
3. Relationships between real-world features are reflected as relationships between measurements.
4. Understanding relationships between measurements can be applied to understanding relationships between real-world features.
5. Understanding relationships between real-world features can be used to influence events.
In other words, data reflects and connects to the world so that understanding data and its relationships contributes to an understanding of the world. When building a model, the modeler must ask if, in this particular case, these assumptions hold.
Reference: “Data Preparation for Data Mining” by Dorian Pyle, Morgan Kaufmann Publishers, chapter 12.1
Document retrieval is defined as the matching of some stated user query against a set of free-text records. These records could be any type of mainly unstructured text, such as newspaper articles, real estate records or paragraphs in a manual. User queries can range from multi-sentence full descriptions of an information need to a few words.
Document retrieval is sometimes referred to as, or as a branch of, Text Retrieval. Text retrieval is a branch of information retrieval where the information is stored primarily in the form of text. Text retrieval is a critical area of study today, since it is the fundamental basis of all internet search engines.
Document retrieval systems find information matching given criteria by comparing text records (documents) against user queries, as opposed to expert systems that answer questions by inferring over a logical knowledge database. A document retrieval system consists of a database of documents, a classification algorithm to build a full-text index, and a user interface to access the database.
A document retrieval system has two main tasks:
- Find documents relevant to user queries.
- Evaluate the matching results and sort them according to relevance, using algorithms such as PageRank.
Internet search engines are classical applications of document retrieval. The vast majority of retrieval systems currently in use range from simple Boolean systems through to systems using statistical or natural language processing techniques.
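The two tasks above (find matching documents, then sort them by relevance) can be sketched with a toy full-text index. This is a minimal sketch under stated assumptions: the whitespace tokenizer, the Boolean AND matching, and the summed term counts used as a relevance score are simplifications chosen for the example, not PageRank and not how any particular search engine works.

```python
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace; a deliberately naive tokenizer."""
    return text.lower().split()


def build_index(documents):
    """Map each term to {doc_id: term count}, a simple inverted index."""
    index = defaultdict(dict)
    for doc_id, text in documents.items():
        for term, count in Counter(tokenize(text)).items():
            index[term][doc_id] = count
    return index


def search(index, query):
    """Boolean AND match, then sort matches by summed term counts."""
    terms = tokenize(query)
    if not terms:
        return []
    postings = [index.get(term, {}) for term in terms]
    matching = set(postings[0])
    for posting in postings[1:]:
        matching &= set(posting)
    scores = {doc: sum(p[doc] for p in postings) for doc in matching}
    # Highest score first; ties broken by document id for a stable order.
    return sorted(scores, key=lambda doc: (-scores[doc], doc))


documents = {
    "d1": "house for sale large garden",
    "d2": "garden tools sale sale sale",
    "d3": "newspaper article about housing",
}
index = build_index(documents)
results = search(index, "garden sale")
```

Here `results` lists `d2` before `d1`, because `d2` mentions the query terms more often; `d3` does not match at all.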
Here I discuss some of the information retrieval frameworks that are commonly used in real-world applications.
Document Based Retrieval
This is the classical and most commonly used text information retrieval framework. In document-based retrieval, an information retrieval system matches the query against documents in the collection and returns a ranked list of documents to the user. The main advantage of this approach is that it is easy to model and develop.
Cluster based Retrieval
Cluster-based retrieval is based on the hypothesis that similar documents will match the same information needs. It groups the documents into clusters and returns a list of documents based on the clusters that they come from. Many approaches to cluster-based retrieval have been proposed in the literature, but they all fall into two main categories.
One approach to cluster-based retrieval is to retrieve one or more clusters in their entirety in response to a query. The task for the retrieval system is to match the query against clusters of documents instead of individual documents, and rank clusters based on their similarity to the query. Any document from a cluster that is ranked higher is considered more likely to be relevant than any document from a cluster ranked lower on the list. This is in contrast to most other cluster search methods that use clusters primarily as a tool to identify a subset of documents that are likely to be relevant, so that at the time of retrieval, only those documents will be matched to the query. This approach has been the most common for cluster-based retrieval.
The second approach to cluster-based retrieval is to use clusters as a form of document smoothing. Previous studies have suggested that by grouping documents into clusters, differences between representations of individual documents are, in effect, smoothed out.
In most early attempts, the strategy has been to build a static clustering of the entire collection in advance, independent of the user’s query, and to retrieve clusters based on how well their centroids match the query. While some studies comparing the effectiveness of cluster-based retrieval using static clustering with that of document-based retrieval have shown that the former has the potential to outperform the latter for precision-oriented searches, other experimental work has suggested that document-based retrieval is generally more effective.
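The first approach, ranking whole clusters by how well their centroids match the query, can be sketched as follows. The static cluster assignments, the term-count vectors, and the cosine similarity measure are all assumptions made for this example; the studies mentioned above may use different representations.

```python
import math
from collections import Counter


def vectorize(text):
    """Bag-of-words term counts for a piece of text."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-weight mappings."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def centroid(vectors):
    """Average term weights over the documents of one cluster."""
    total = Counter()
    for vec in vectors:
        total.update(vec)
    return {term: count / len(vectors) for term, count in total.items()}


def rank_clusters(clusters, documents, query):
    """Order cluster names by centroid similarity to the query."""
    q = vectorize(query)
    scored = []
    for name, doc_ids in clusters.items():
        c = centroid([vectorize(documents[d]) for d in doc_ids])
        scored.append((cosine(q, c), name))
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored]


def retrieve(clusters, documents, query):
    """Return documents cluster by cluster, best-matching cluster first."""
    ordered = rank_clusters(clusters, documents, query)
    return [doc_id for name in ordered for doc_id in clusters[name]]


documents = {
    "d1": "stock market prices fall",
    "d2": "market rally lifts stock prices",
    "d3": "team wins football match",
    "d4": "football season opens",
}
# Static clustering built in advance, independent of the query.
clusters = {"finance": ["d1", "d2"], "sport": ["d3", "d4"]}
ranked = retrieve(clusters, documents, "stock prices")
```

Every document from the higher-ranked cluster comes before any document from a lower-ranked one, which is exactly the behavior the first approach describes.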
Taxonomy based Retrieval
Taxonomy-based retrieval is similar to cluster-based retrieval, but the clusters are organized as a structured taxonomy. There are two ways to search a document collection organized in a taxonomy. The first method is a top-down search. We begin at the root of the taxonomy and search for a specific cluster by progressively comparing the query with cluster representatives at lower levels. At the lowest level, the relevant documents are found in the most specific matching cluster. One risk with this method is the possibility of making a cluster matching error that leads to a wrong path in the taxonomy. A single matching error at any one of the higher levels may lead to a specific cluster with irrelevant documents.
The alternative method is a bottom-up search. Queries are compared with the most specific clusters at the lowest level. The likelihood of finding irrelevant documents is lower when we compare queries against a set of specific clusters. However, a high number of low-level clusters increases the computation needed to locate one or more relevant clusters. An inverted index of the low-level cluster representatives can speed up the search.
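The top-down search and its wrong-path risk can be illustrated with a tiny taxonomy. The node layout (dictionaries with `name`, `terms`, `documents`, `children`), the term-overlap matching, and the sample categories are hypothetical choices for this sketch.

```python
def overlap(query_terms, representative):
    """Number of query terms shared with a cluster representative."""
    return len(set(query_terms) & set(representative))


def top_down_search(root, query_terms):
    """Descend from the root, following the best-matching child at each level."""
    node = root
    path = [node["name"]]
    while node["children"]:
        # max() keeps the first child on ties, so a tie can pick a wrong branch.
        node = max(node["children"],
                   key=lambda child: overlap(query_terms, child["terms"]))
        path.append(node["name"])
    return path, node["documents"]


def leaf(name, terms, documents):
    """Build a lowest-level cluster that holds documents and has no children."""
    return {"name": name, "terms": terms, "documents": documents, "children": []}


taxonomy = {
    "name": "root", "terms": [], "documents": [],
    "children": [
        {"name": "Arts", "terms": ["music", "painting", "film"], "documents": [],
         "children": [leaf("Music", ["music", "player"], ["doc_music"])]},
        {"name": "Computers", "terms": ["apple", "software", "hardware", "system"],
         "documents": [],
         "children": [
             leaf("Hardware", ["laptop", "hardware", "chip"], ["doc_hw"]),
             leaf("Software", ["software", "operating", "system"], ["doc_os"]),
         ]},
    ],
}

path, docs = top_down_search(taxonomy, ["apple", "operating", "system"])
```

The query above follows root, Computers, Software. A query such as `["music", "software"]` ties at the first level and ends up under Arts, so the software documents are never examined; that is the kind of high-level matching error the text warns about.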
Yet another way of using a taxonomy is to use the popular categories from a search engine hit list to rank the results of a query. For every query, the matching documents are grouped by category. This assumes that every document belongs to at least one category. Documents from the category with the largest number of hits are ranked higher than documents from other categories.
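Ranking a hit list by category popularity can be sketched in a few lines. For simplicity the sketch assumes each hit carries exactly one category, which is narrower than the "at least one" assumption in the text; `rank_by_category` and the sample hits are illustrative.

```python
from collections import Counter


def rank_by_category(hits):
    """Reorder (doc_id, category) hits so larger categories come first.

    sorted() is stable, so hits keep their original order within a category
    and between categories of equal size.
    """
    counts = Counter(category for _, category in hits)
    ordered = sorted(hits, key=lambda hit: -counts[hit[1]])
    return [doc_id for doc_id, _ in ordered]


hits = [
    ("d1", "Arts"),
    ("d2", "Computers"),
    ("d3", "Computers"),
    ("d4", "Food"),
    ("d5", "Computers"),
]
ranked_hits = rank_by_category(hits)
```

The three Computers documents move to the top of the list, followed by the single Arts and Food hits in their original order.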
Searches can also be made at different levels of the taxonomy. For example, if the query “apple” is submitted with the relevant category /Arts/music, the searcher is possibly looking for music devices from the company Apple. If the same query is submitted with the /Computers category, the search is interpreted as a general query for Apple software, hardware, and other products. With the more specific category /Computers/System/Software, the query will find hits about operating systems and other system software used in Apple computers.
Opinion mining (OM) is a recent discipline at the crossroads of information retrieval and computational linguistics that is concerned not with the topic a document is about, but with the opinion it expresses. An opinion is a private state that is not open to objective observation or verification [Quirk et al., 1985]. Sentiment analysis, sentiment classification, and opinion extraction are other names used in the literature for this discipline. OM techniques fall under three major tasks:
1. Development of linguistic resources for OM, e.g., automatically building lexicons of subjective terms.
2. Classification of text (entire documents, sentences, and features) by opinion content, e.g., classifying a movie review as either positive or negative.
3. Extraction of opinion expressions from text, possibly including relations with the rest of the content, e.g., recognizing an opinion, who is expressing it, and who or what is its target.
The goal of opinion mining systems is to identify the pieces of text that express opinions (Breck et al., 2007; Konig and Brill, 2006) and then measure the polarity and strength of the expressed opinions. While the task seems intuitively straightforward, there are multiple challenges involved.
- What makes an opinion positive or negative?
- Is there an objective measure for this task?
- How can we rank opinions according to their strength?
- Can we define an objective measure for ranking opinions?
- How does the context change the polarity and strength of an opinion and how can we take the context into consideration?
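A small lexicon-based scorer makes these challenges concrete. This is a toy sketch, not a technique from the cited work: the word lists, the strength values, and the rule that a negator flips only the next word are all assumptions chosen to show how polarity, strength, and context interact.

```python
# Hypothetical lexicon: sign gives polarity, magnitude gives strength.
LEXICON = {
    "good": 1, "great": 2, "excellent": 3,
    "bad": -1, "poor": -2, "terrible": -3,
}
NEGATORS = {"not", "never", "no"}


def score_opinion(text):
    """Sum lexicon values; a negator flips the sign of the very next word only."""
    score = 0
    negate = False
    for token in text.lower().split():
        word = token.strip(".,!?;:")
        if word in NEGATORS:
            negate = True
            continue
        if word in LEXICON:
            value = LEXICON[word]
            score += -value if negate else value
        negate = False
    return score


def polarity(text):
    """Map the numeric score to a coarse polarity label."""
    score = score_opinion(text)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

The sketch handles "The screen is not good." but already fails on "not very good", where the negator and the opinion word are separated. Context such as this is why the questions above have no simple objective answer.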
OM is one of the most active research areas in data mining. A great deal of knowledge can be extracted from opinions, and that knowledge can support further decisions. Much research is under way in this area, and it still holds hidden treasures. Commercially, there is considerable scope to develop and market opinion-based products. We are conducting research and trying to build a system based on product reviews. The abstract of that work follows.
“The Internet is flooded with all the information one needs when deciding to buy a product. But only a few people use this potential, as it is currently a very complicated process to gather the available knowledge from the Internet and interpret it properly. Typically, the Internet has many websites dedicated to user product reviews. People who have used a product jot down their feedback about it, be it positive or negative. While making a decision, one needs to read reviews of the chosen product to collect information about its pros and cons. After reading, one needs to reach a decision by making tradeoffs among various parameters such as cost, durability, availability, maintenance, service, support, pride, etc. We propose a system that could summarize product reviews and draw interpretations with factual data references, to help a user make a decision quickly and easily. The proposed system would accept user preferences in natural language and make recommendations based on them. Our system consists of an NL Parser, Semantic Network, Ontology, Summarizer & Ranking system.”
Trying to discover innovative methods to steal the show? Here’s an answer to all your questions as to how to stand out during the meetings at your workplace. Preparation and confidence are the two key factors that you would need to distinguish yourself in a meeting. If you are well prepared and have full confidence in yourself no one can beat you to making yourself noticeable in a meeting. Here are a few tips on how you can ‘wow’ one and all in a meeting.
- Prepare for the meeting well in advance. Take notes recording your thoughts on the topics and do anything else you need to do to prepare for the meeting - whether it be reading an article or having a quick conversation with any of your team members.
- Always show up for the meeting on time or a few minutes early so that you are all set when it begins. Arriving on time shows your colleagues and seniors that you are time conscious and that you value others' time. It also helps you create a good impression on your team.
- Staying focused is very important in a meeting. Set aside all your other activities, including personal work, and pay attention to what is happening. Maintaining eye contact can help a great deal in avoiding distractions.
- Be an active member of the meeting. Participate fully and keep offering helpful suggestions, but be sure that you are not taking over the meeting. Give all the other members an equal chance to take part. This will help build a good rapport with them, and they will gain confidence in you and your skills.
- Always challenge ideas in a professional manner, and be sure to offer a better idea whenever you reject one. Do not attack anyone during a meeting, and never get into an argument.
- Confidence and positive attitude are very important during the meeting. Have confidence that you know what you are talking about and that you are adding valuable input to the meeting. Phrase your suggestions and challenges in a positive way, and encourage others to do the same during the meeting.
- Always follow up on meetings. Make a note of the things that need to be handled after the meeting and do complete them as soon as possible. If you get new ideas after the meeting, make a suggestion for a new meeting by sending a quick email or setting up an appointment with the person planning the next meeting.
- Select clearly defined problems that will yield tangible benefits.
- Specify the required solutions.
- Define how the solution delivered is going to be used.
- Understand as much as possible about the problem and the data set (the domain).
- Let the problem drive the modeling (i.e. tool selection, data preparation, etc).
- Stipulate assumptions.
- Refine the model iteratively.
- Make the model as simple as possible - but no simpler.
- Define instability in the model (critical areas where change in output is drastically different for a small change in inputs).
- Define uncertainty in the model (critical areas and ranges in the data set where the model produces low confidence predictions/insights).
Reference: “Data Preparation for Data Mining” by Dorian Pyle, Morgan Kaufmann Publishers, chapter 1.2.1
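The rule about defining instability in the list above (areas where a small change in inputs produces a drastically different output) can be probed numerically. The finite-difference check, the threshold, and the step-like `toy_model` below are assumptions made for illustration and are not taken from the book.

```python
def instability(model, x, delta=1e-3):
    """Absolute change in output per unit change in input around x."""
    return abs(model(x + delta) - model(x)) / delta


def unstable_points(model, xs, threshold, delta=1e-3):
    """Inputs where a small perturbation moves the output more than `threshold` per unit."""
    return [x for x in xs if instability(model, x, delta) > threshold]


def toy_model(x):
    """Step-like model: the output jumps from 0 to 100 at x = 5."""
    return 0.0 if x < 5.0 else 100.0


flagged = unstable_points(toy_model, [1.0, 4.9995, 9.0], threshold=10.0)
```

Only the input just below the step is flagged; away from it the toy model does not react to small perturbations at all.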
The problem of text categorization can be described as the classification of text documents into multiple categories. We have a set of n categories {C1, C2, C3, …, Cn} to which we assign m documents {D1, D2, …, Dm}.
The n categories are predefined with specific keywords that differentiate any category Ci from every other category Cj. The process of identifying these keywords is called feature extraction. Not all the keywords of a document are useful discriminators. In many cases, the inverse document frequency (IDF) indexing method is used to create document representatives: a keyword is weighted higher if it occurs often in one document and rarely across the collection. Such keywords (features) are useful discriminators for isolating relevant documents during retrieval.
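The weighting rule just described can be written down as a short sketch. It uses one common TF-IDF variant (raw term count times log(N / document frequency)); other variants exist, and the sample documents are made up.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return {doc_id: {term: weight}} with weight = tf * log(N / df)."""
    tokenized = {doc_id: text.lower().split() for doc_id, text in documents.items()}
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter()
    for tokens in tokenized.values():
        df.update(set(tokens))
    weights = {}
    for doc_id, tokens in tokenized.items():
        tf = Counter(tokens)
        weights[doc_id] = {
            term: count * math.log(n_docs / df[term]) for term, count in tf.items()
        }
    return weights


documents = {
    "d1": "patent claim patent filing",
    "d2": "claim email",
    "d3": "claim blog post",
}
weights = tf_idf(documents)
```

In this example "claim" appears in every document and gets weight zero, while "patent" occurs twice in `d1` and nowhere else, so it becomes the strongest discriminator for `d1`.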
Some of the problems of creating clusters are common to text categorization as well. At first, text categorization appears simpler than clustering. Documents are assigned to one or more categories based on the degree of similarity with a category description. A classifier uses a similarity measure to evaluate documents against categories to find the closest category. This still leaves several questions unanswered.
· How many categories are sufficient for the collection?
· What is the maximum size for a category?
· Are categories organized in a flat or a hierarchical organization?
· Should documents be assigned to one or more categories?
In a dynamic collection, it is difficult to predict the contents of all documents that will be added to the collection. If we have too few categories or the description of a category is very general, then the size of a category can be excessive. At the same time, when categories are too specific, retrieval is harder without the knowledge of specific keywords, and it takes more time to find the right category. We seek balance in the specificity of a category such that a category does not become too large or too small. For a dynamic collection, this balance is difficult to determine beforehand. Instead, categories are periodically adjusted to match the current state of the document collection.
For a large set of categories, it makes sense to organize categories in a hierarchy. In general, it is easier to locate relevant documents in a hierarchy than a flat list. A multi-topic document may belong to more than one category. The many-to-many relationship between documents and categories is represented in overlapping categories. The decision to assign a document to a category is usually made based on a measure of similarity with other documents or a set of features of the category. When the similarity measure exceeds a threshold, a document is included in the category. The threshold is one of the control parameters to create loose or tightly focused categories.
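Threshold-based assignment to overlapping categories can be sketched as follows. The Jaccard similarity between a document's terms and a category's feature set is an assumed measure chosen for brevity; the category names and features are hypothetical.

```python
def jaccard(a, b):
    """Size of the intersection divided by size of the union of two term sets."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def assign_categories(doc_terms, category_features, threshold):
    """Return every category whose similarity to the document exceeds the threshold."""
    return sorted(
        name
        for name, features in category_features.items()
        if jaccard(doc_terms, features) > threshold
    )


category_features = {
    "support": ["refund", "order", "delivery"],
    "billing": ["invoice", "refund", "payment"],
    "legal": ["patent", "claim"],
}
doc_terms = ["refund", "order", "invoice"]

loose = assign_categories(doc_terms, category_features, threshold=0.4)
tight = assign_categories(doc_terms, category_features, threshold=0.6)
```

With the lower threshold the document lands in both `billing` and `support` (a many-to-many assignment); raising the threshold leaves it unassigned, which shows how the threshold controls loose versus tightly focused categories.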
Text categorization can be used in applications where there is a flow of dynamic information that needs to be organized. By dynamic information we mean email, news articles, blogs, scientific articles, patents, and legal data. Applications include the automatic routing of customer support requests, tagging medical claims, and tracking entities in the flow of information. This type of information is generated on a daily basis, and utilizing it is difficult without some type of categorization.
I have mentioned some of my research interests here.
Text Classification
Today’s organizations face a vast volume of knowledge and information. Most of the explicit knowledge is stored in different types of documents, but only a few people (often only the authors of the documents) know where to locate them. A major approach for organizing information is to classify collected information according to a pre-defined set of classes and to retrieve relevant information by browsing the list of classes used. The enormous increase in the amount of digital information available, and the demand for retrieval tools to manage the information overload, have led to an interest in automatic classification, with the expectation of reducing human labor to a significant extent or even replacing part of it. The objective of document classification is to reduce the detail and diversity of data, and the resulting information overload, by grouping similar documents together.
An effective method for document classification is content-based classification, which classifies documents based on their contents. Such a method proceeds as follows. First, keywords and terms are extracted using information retrieval and simple manual analysis techniques. Second, concept hierarchies of keywords and terms are obtained using available term classes such as WordNet, expert knowledge, or keyword classification systems. Documents in the training set can also be classified into class hierarchies. An analysis method can then be applied to discover sets of associated terms that maximally distinguish one class of documents from others and that can be used to classify new documents. Many research areas remain to be explored in document classification. I have listed some of them.
· Feature selection for text classification
· Semantic indexing techniques and classification models
· Automatic classification structure (taxonomy) learning for classification
· Multi-class and Multi-Label classification
· Integration of multiple sources for classification
· Classification with background information (E.g., with the help of an Ontology)
· Hierarchical classification
Text Information Retrieval
My experience in text information retrieval focuses on combining multiple resources, evidence, and criteria to incorporate domain knowledge for query expansion and result ranking. The query expansion module improves existing techniques by using several term-weighting schemes to group and combine terms from different sources based on their characteristics, which proves to be more effective than the typical approach of treating expansion terms equally. For result ranking, different scoring criteria are used to evaluate evidence at the document, passage, and term-matching granularities, and these are then combined to produce a final ranking. The main challenges of this work are how to incorporate multiple models, how to find effective techniques for query expansion and term weighting, and how to perform ranking with multiple scoring criteria.
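As a rough illustration of combining scoring criteria from several granularities into one ranking, the sketch below uses a weighted sum. The linear combination, the criterion names, the weights, and the scores are all assumptions for the example; the text does not say which combination method was actually used.

```python
def combine_scores(evidence, weights):
    """Rank documents by a weighted sum of their per-criterion scores.

    evidence: {doc_id: {criterion: score}}
    weights:  {criterion: weight}
    """
    combined = {
        doc_id: sum(weights[criterion] * score for criterion, score in scores.items())
        for doc_id, scores in evidence.items()
    }
    # Highest combined score first; ties broken by document id.
    return sorted(combined, key=lambda doc_id: (-combined[doc_id], doc_id))


# Made-up scores at the three granularities mentioned in the text.
evidence = {
    "d1": {"document": 0.2, "passage": 0.9, "term": 0.5},
    "d2": {"document": 0.8, "passage": 0.1, "term": 0.5},
}

document_heavy = combine_scores(evidence, {"document": 0.5, "passage": 0.3, "term": 0.2})
passage_heavy = combine_scores(evidence, {"document": 0.2, "passage": 0.6, "term": 0.2})
```

Shifting weight from document-level to passage-level evidence reverses the order of the two documents, which is why choosing how to combine the criteria is one of the main challenges.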
I am planning to conduct detailed research on the effective combination of information from various resources and aspects in multiple stages of retrieval for a domain-dependent application (intelligent resume processor, law documents search engine, talent management system, medical information retrieval).
Domain-Dependent and Task-Specific Information Access
Specialized search provides high-quality results for domain-dependent and task-specific information access, and greatly complements general-purpose search. What intrigues me most in specialized search is its potential to incorporate knowledge about domains/tasks to better capture the characteristics of the data and users, which can lead to considerably improved performance. I am interested in working towards general frameworks to incorporate and integrate information from multiple sources such as contents, prior knowledge, and external resources. The frameworks will include more sophisticated techniques than simple forms of combination for information integration, especially for the cases where information is represented in a wide variety of forms, or implicit dependencies exist between different pieces of information. One important lesson I have learned from my previous work is that understanding the characteristics of the domain, task, and data first and developing techniques accordingly is far more important and effective than mechanically applying theoretically sound models without detailed data analysis. I will continue to use this general approach in designing the most appropriate methods for domain-dependent and task-specific solutions.
Methods developed for domain-dependent tasks often use specialized knowledge resources. In some cases these resources can be created by text mining of large or well-organized corpora. I am mostly interested in mining entity relations that are embedded in unstructured text contents. I plan to conduct research on learning and extracting typical relational patterns between entities in specific domains (e.g., genes, diseases, symptoms, and medicines in the biomedical domain) for active knowledge discovery, focusing on semi-supervised or unsupervised methods that require little training data. Furthermore, I will work on adapting and improving the techniques developed for simple entities to more complicated ones with multiple attributes or facets, which I believe will benefit many domain-dependent applications. Again, careful data analysis and attention to detail go a long way toward building the best solutions to particular problems.
Text mining techniques for Business Intelligence and CRM
Predictive text analytics enables an organization to make true multi-channel customer relationship management (CRM) a reality. By combining text mining with predictive analysis, one can build detailed models of customer behavior and preferences that can be used throughout the organization. Text mining for CRM is an emerging field, and many more models are being developed to uncover hidden knowledge about potential customers and their views. I have listed some of the important research topics in this field.
· Analyze call center transcripts to identify customer concerns, then tie that information back to customer actions and segments.
· Predict the offers customers are most likely to accept, increasing up-selling and cross-selling results whether in person, in the call center, or online.
· Improve customer retention by determining which customer complaints are most likely to precede defection, and take action to prevent it.
· Discover what drives customers to your customer service call center and identify areas for improvement.
· Identify common customer complaints from online customers by analyzing customer e-mail and instant message transcripts. Use this information to identify areas of your site that need improvements.