Monday, August 27, 2007

Data Mining in Social Networking Sites



This clip shows a very interesting information system that society, or the government, has created. Through various websites, from social networking sites and click-and-mortar stores to federal agencies, our details are kept in order to identify us. Perhaps they hold more information about us than we could possibly know about ourselves.

This gives an interesting perspective on data mining in CRM for organisations. Organisations can now obtain information about individual preferences through social networking sites such as Facebook. A profile can contain your marital status, current occupation, age, and even credit card number and type. This information can be used as a starting point for data mining by organisations seeking to analyse individual personalities and behaviour.

Another thing I encountered last week: a website where I had registered with my credit card asked for my credit card number for verification, a year after I cancelled my subscription. I wonder why they would want to keep my credit card number for a year after cancellation. It does not really improve their business; I believe it only serves as data for their CRM.

All these issues bring us to what is known as customer privacy. Has putting so much private information on websites compromised our privacy, or are we really getting better customer service and a higher quality of life as a result? More on this will be explored in the following weeks' posts.

Monday, August 20, 2007

Neural Network vs Decision Tree

A neural network is a data mining tool derived from the idea of how neurons in the human brain work. Before it can be used, it has to go through a learning phase in which data are fed into a predefined set of mathematical algorithms. After the learning phase, the user can use the neural network to make predictions for any given input.

In contrast, a decision tree requires the user to know both the inputs and the output of the prediction. It is easier to understand because it is shaped as a logical sequence of decisions leading to a prediction.

Both of these techniques are important for data mining, but this time I will explore which tool is better at making predictions.

First of all, a neural network has its own shortcomings: it is incomprehensible to most people, and validating its predictions is hard. The user has no idea how the neural network arrived at a certain result. However, the technique is basically categorising data into sensible information and presenting it to the user. By categorising data, it steps into the territory of the decision tree model, because a decision tree is also about sorting data into different categories and trying to make sense of them. Seen this way, both techniques are actually doing the same thing. They differ in how they are developed and implemented, but the general idea of how they create information for the user is the same.
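
To make the comparison concrete, here is a minimal sketch in Python. It is not how any real data mining product works: the data, the function names and the age scaling are all assumptions made up for illustration. It trains a one-level decision tree (a "stump") and a single artificial neuron (a perceptron, the simplest building block of a neural network) on the same toy data. Both end up sorting customers into the same two categories.

```python
# Hypothetical toy data: (customer age, label), where 1 = bought and 0 = did not buy.
DATA = [(18, 0), (22, 0), (25, 0), (31, 1), (40, 1), (52, 1)]


def train_stump(data):
    """One-level decision tree: pick the age threshold with the fewest errors."""
    values = sorted(x for x, _ in data)
    best_threshold, best_errors = None, len(data) + 1
    for low, high in zip(values, values[1:]):
        threshold = (low + high) / 2  # candidate split between two neighbours
        errors = sum((x > threshold) != bool(y) for x, y in data)
        if errors < best_errors:
            best_threshold, best_errors = threshold, errors
    return best_threshold


def predict_neuron(weight, bias, x):
    """Output 1 when the weighted input is positive, else 0."""
    return 1 if weight * (x / 100) + bias > 0 else 0


def train_neuron(data, max_epochs=10_000, rate=0.1):
    """Single artificial neuron (perceptron): nudge weight and bias after each mistake."""
    weight, bias = 0.0, 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for x, y in data:
            error = y - predict_neuron(weight, bias, x)
            if error:
                weight += rate * error * (x / 100)  # ages scaled to roughly 0..1
                bias += rate * error
                mistakes += 1
        if mistakes == 0:  # every training row is classified correctly
            break
    return weight, bias


threshold = train_stump(DATA)
weight, bias = train_neuron(DATA)
print(f"Tree rule: predict 'bought' if age > {threshold}")
print(f"Neuron: weight={weight:.3f}, bias={bias:.3f}")
```

On this toy data the tree gives a rule that anyone can read (age above 28.0), while the neuron gives only a weight and a bias. The two models categorise the data the same way, but only one of them explains itself.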

My point is that, regardless of how many tools and techniques are out there, the core of data mining has not changed since people first did it without computers centuries ago. Essentially, data mining is always about sorting data into categories to enable the decision maker to make a judgment. Although different tools and techniques use different algorithms, they all support the same idea of categorising data. In short, there is no single superior tool or technique when it comes to data mining. It is more about reinforcing the precision of a prediction by applying different algorithms.

Tuesday, August 7, 2007

Data Mining and Knowledge Discovery

In this week's lecture, the lecturer explained the techniques and development model for data mining. These topics are pretty straightforward. However, the most interesting part of the lecture was the relationship between data mining and knowledge discovery. The two terms are used interchangeably in the industry. She said that academics tend to use the term knowledge discovery, while vendors tend to use the term data mining more often. Some also say that data mining is part of knowledge discovery. I certainly agree with the latter view.

Most techniques in data mining, such as decision trees, visualisation and statistics, are not new. They have been used by mankind for decades, if not centuries. They mainly aim to help people look at something differently and make predictions based on previous trends. In short, this can be called judgment based on precedent cases. It enables people to understand large amounts of historical data by arranging those data into, for example, statistics. Thus, because data mining applies such techniques, data mining is actually part of knowledge discovery.

Knowledge discovery also encompasses more than data mining. People can discover knowledge from various sources. Before going into knowledge discovery, I will discuss what knowledge is.

Data/Artefacts + Context --> Information + General Truth --> Knowledge

I might be wrong to describe how things become knowledge in such a simple diagram, but let's say this is how we perceive knowledge.

As we can see, knowledge is the end product of a chain. To gain new knowledge, we must have data or artefacts surrounded by meaningful context, which turns them into information. For instance, the number 20 does not mean anything unless it is associated with a context. Thus, 20 in a classroom might mean that there are 20 people in the classroom or 20 tables in the classroom. However, information alone does not make new knowledge, because information that is not associated with what we already know cannot be put into practice. For example, 20 people in a classroom is a piece of information, but we cannot use it unless we are told what to do or what can be done. Only information that is associated with a general truth can become knowledge. We can therefore say that knowledge is a tool or method that enables us to carry out a task appropriately.
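
The chain can also be written as a small Python sketch. This is only my own illustration of the diagram: the class, the function names and the "general truth" used here (that a classroom holds at most 25 people) are all assumptions invented for the example.

```python
from dataclasses import dataclass


@dataclass
class Information:
    """A raw datum that has been given a context."""
    value: int
    context: str


def add_context(datum, context):
    """Data + Context --> Information."""
    return Information(datum, context)


def apply_general_truth(info, capacity):
    """Information + General Truth --> Knowledge.

    Hypothetical general truth: a classroom holds at most `capacity` people.
    """
    spare = capacity - info.value
    if spare >= 0:
        return f"{info.value} {info.context}: room for {spare} more"
    return f"{info.value} {info.context}: over capacity by {-spare}"


info = add_context(20, "people in a classroom")
print(apply_general_truth(info, capacity=25))
```

The bare number 20 cannot be acted on. With context it becomes information, and only once a rule is applied does it tell us what can be done.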

In order to discover new knowledge, we must first discover data or artefacts. This can be done in different ways: for instance, walking through a park and observing what is there, or driving a golf ball and seeing how far it can be hit. Some of these knowledge discovery methods are not directly associated with data mining or cannot be done with data mining. An example that can be done using data mining is looking at historical sales data to predict future trends. This means discovering knowledge mathematically or systematically.
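
As a rough illustration of the sales example, the sketch below fits a straight line to past sales with ordinary least squares and extends it one period ahead. The sales figures and function names are made up for the example, and a straight line is only one of many possible trend models.

```python
def fit_trend(sales):
    """Least-squares straight line through (period index, sales) points."""
    n = len(sales)
    periods = range(n)
    mean_x = sum(periods) / n
    mean_y = sum(sales) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(periods, sales))
             / sum((x - mean_x) ** 2 for x in periods))
    intercept = mean_y - slope * mean_x
    return slope, intercept


def forecast(sales, periods_ahead=1):
    """Extend the fitted line `periods_ahead` periods past the last observation."""
    slope, intercept = fit_trend(sales)
    return intercept + slope * (len(sales) - 1 + periods_ahead)


monthly_sales = [100, 110, 120, 130]  # hypothetical figures
print(forecast(monthly_sales))  # 140.0
```

Sales that grew by 10 each month are predicted to grow by 10 again, which is exactly the "judge by precedent" idea behind this kind of data mining.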

In short, data mining can be seen as part of knowledge discovery, but it is not the whole of knowledge discovery. This is because knowledge can be gained through different methods, and data mining is just one of them.

Monday, August 6, 2007

Making Choices

The presentation by Malcolm Gladwell about The Paradox of Choice prompts the question of whether we should strive for more options or limit the variation of a product. Essentially, it comes down to the question of to what extent increasing choice can increase the general welfare of society.

Customer relationship management is fundamentally about trying to increase the welfare of customers in order to earn their loyalty. So it makes sense to make the correct choice about how to vary a particular product to serve different segments of the market. However, making the correct choice is hard because of the limits of what the technologies can do.

Let's step back and look at the theory that Gladwell presented. He said that providing more choices decreases the customer's economic utility. However, limiting choices has the same consequence. This is exactly what the idiom has said for ages.

"The grass is always greener on the other side."

This basically points out that, regardless of what the manufacturer decides to do, there will be contradicting consequences. Ultimately, the best choice lies somewhere in between: the correct choice is choice optimisation. By optimisation, I mean that it depends on how the manufacturer thinks the market can be segmented and how each segment can best be served.

However, from the customer's perspective, making a purchase is about making a decision. Good decisions are usually decisions that are not biased. By limiting the choices a customer can make, we place a biased constraint on the customer's ability to make a good decision. In contrast, allowing a free flow of choices will overwhelm the customer with options that they might not be able to digest in time to make a wise decision. This is also reflected in Microsoft's data warehousing philosophy of providing managers with reports that will not overwhelm the decision maker. However, that approach is biased, because it depends on how the developer selects the information that should be available to the manager, instead of the manager seeking the relevant information in the data warehouse.

In conclusion, my view on CRM and product variation to serve society is that CRM can play the central role in defining and optimising the choices that should be available. As always, the best outcome is a balanced one, because no one likes extremes. Limiting the options to a single one or expanding them to a huge number will both have bad consequences.