[Savannah-register-public] [task #6719] Submission of massive data cube

From: edward yoon
Subject: [Savannah-register-public] [task #6719] Submission of massive data cube housing
Date: Mon, 16 Apr 2007 01:48:25 +0000
User-agent: Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; InfoPath.1)

Follow-up Comment #6, task #6719 (project administration):


Macho is a multi-dimensional, sparse map store that focuses on massive data
storage on a DFS and on easier data analysis and development.
It could also be described as a distributed database that is more economical
than traditional large databases
and that allows faster analysis of more diverse data.
It does not manage every pre-calculation,
but it stores data in a distributed way, with a structure that allows
distributed computation.
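
To make the "multi-dimensional, sparse map" idea concrete, here is a minimal
in-memory sketch. It is not Macho's actual API: the class name SparseMap and
its methods are hypothetical, and a real store would spread rows across a DFS
instead of keeping them in one process. It only shows that cells are stored
when they are written, so empty cells cost nothing.

```python
from collections import defaultdict


class SparseMap:
    """Hypothetical sparse map keyed by (row, column); not Macho's real API."""

    def __init__(self):
        # row key -> {column key -> value}; only written cells are stored
        self._rows = defaultdict(dict)

    def put(self, row, column, value):
        self._rows[row][column] = value

    def get(self, row, column, default=None):
        # .get() avoids creating an empty row as a side effect of a read
        return self._rows.get(row, {}).get(column, default)

    def columns(self, row):
        # Columns that actually hold a value for this row
        return sorted(self._rows.get(row, {}))


table = SparseMap()
table.put("user1", "search history", ["camera", "tripod"])
table.put("user2", "item buying log", ["camera"])
```

Here user1 has no "item buying log" cell at all, so
table.get("user1", "item buying log") falls back to the default.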

Why do we need it? 
The amount of data is enormous and it grows exponentially. 
On top of the simple storage needs, we would like to do some data analysis as
well.
We want our DB to be light-weight. 
We want our DB to adapt to the ever-changing needs and requirements of new

Conclusion: We want to extract more value out of a company’s data by
providing more availability and usability when the company’s needs arise.

A usage example of Macho: a user action log data table for a service
To help make a business decision, 
to find a way to meet the need of each customer, 
or to find a product or a market that will bring big profits, 
we group together action logs of users and create a User Table like the one
below:

row [ user ], attribute columns [ search history, item buying log, post scrap
log, Page Viewing log, User neighborhood (blog), User active part (cafe) ] 

If we select two columns, the fact table in the above schema can be
represented as a two-dimensional table.
(Analysis Framework) 
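
One possible reading of the two-column selection is a cross-tabulation: pick
two attribute columns and count how often each pair of values occurs for the
same user. The sketch below assumes that reading. The sample data, the
dictionary layout, and the function name cross_tab are made up for
illustration and are not taken from Macho.

```python
from collections import Counter
from itertools import product

# Hypothetical User Table: row key -> {attribute column -> list of values}
user_table = {
    "user1": {"search history": ["camera"], "item buying log": ["camera", "tripod"]},
    "user2": {"search history": ["camera", "lens"], "item buying log": ["lens"]},
    "user3": {"search history": ["tripod"]},  # sparse: no buying log at all
}


def cross_tab(table, col_a, col_b):
    """Count (value in col_a, value in col_b) pairs that share a user."""
    counts = Counter()
    for columns in table.values():
        # A user missing either column contributes no pairs
        for pair in product(columns.get(col_a, []), columns.get(col_b, [])):
            counts[pair] += 1
    return counts


fact = cross_tab(user_table, "search history", "item buying log")
```

Each key of fact is one cell of the two-dimensional table, for example
("camera", "tripod") for users who searched for a camera and bought a tripod.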

Who referred to document A? What other documents do they also like?
What does a user who actively participates in an online community X like to
Who are the neighbors of this blog’s author? What are the social distances
between them?
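
The first question above can be sketched against a single column of the User
Table. This assumes the "Page Viewing log" column records which documents a
user referred to, which is an assumption on my part. The sample data and the
function names readers_of and also_liked are hypothetical.

```python
from collections import Counter

# Hypothetical "Page Viewing log" column: user -> documents viewed
page_views = {
    "user1": ["A", "B", "C"],
    "user2": ["A", "C"],
    "user3": ["B"],
}


def readers_of(views, doc):
    """Users whose log contains the given document."""
    return sorted(user for user, docs in views.items() if doc in docs)


def also_liked(views, doc):
    """Other documents seen by readers of doc, most common first."""
    counts = Counter()
    for user in readers_of(views, doc):
        # set() so a document counts once per user
        counts.update(d for d in set(views[user]) if d != doc)
    return counts.most_common()
```

With this sample data, readers_of(page_views, "A") gives user1 and user2, and
also_liked(page_views, "A") ranks document C ahead of document B.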
By managing and analyzing this user-related data to find out where new
markets are being formed,
we can analyze the evolution of services faster and more economically.

