LeanBigData aims at addressing three open challenges in big data analytics: 1) The cost, in terms of resources, of scaling big data analytics for streaming and static data sources; 2) The lack of integration of existing big data management technologies and their high response time; 3) The insufficient end-user support leading to extremely lengthy big data analysis cycles. LeanBigData will address these challenges by:Architecting and developing three resource-efficient Big Data management systems typically involved in Big Data processing: a novel transactional NoSQL key-value data store, a distributed complex event processing (CEP) system, and a distributed SQL query engine. We will achieve at least one order of magnitude in efficiency by removing overheads at all levels of the big-data analytics stack and we will take into account technology trends in multicore technologies and non-volatile memories. Providing an integrated big data platform with these three main technologies used for big data, NoSQL, SQL, and Streaming/CEP that will improve response time for unified analytics over multiple sources and large amounts of data avoiding the inefficiencies and delays introduced by existing extract-transfer-load approaches. To achieve this we will use fine-grain intra-query and intra-operator parallelism that will lead to sub-second response times.Supporting an end-to-end big data analytics solution removing the four main sources of delays in data analysis cycles by using: 1) automated discovery of anomalies and root cause analysis; 2) incremental visualization of long analytical queries; 3) drag-and-drop declarative composition of visualizations; and 4) efficient manipulation of visualizations through hand gestures over 3D/holographic views.Finally, LeanBigData will demonstrate these results in a cluster with 1,000 cores in four real industrial use cases with real data, paving the way for deployment in the context of realistic business processes.
LeanBigData: Ultra-Scalable and Ultra-Efficient Integrated and Visual Big Data Analytics
- ORBIT: Business Continuity as a Service
- EXPERIMEDIA: EXPERiments in live social and networked MEDIA experiences