Author: Matt Casters
Publisher: John Wiley & Sons
Release Date: 2010-09-02
A complete guide to Pentaho Kettle, the Pentaho Data Integration toolset for ETL. This practical book is a complete guide to installing, configuring, and managing Pentaho Kettle. If you’re a database administrator or developer, you’ll first get up to speed on Kettle basics and how to apply Kettle to create ETL solutions, before progressing to specialized concepts such as clustering, extensibility, and data vault models. Learn how to design and build every phase of an ETL solution. Shows developers and database administrators how to use the open-source Pentaho Kettle for enterprise-level ETL processes (extracting, transforming, and loading data). Assumes no prior knowledge of Kettle or ETL, and brings beginners thoroughly up to speed at their own pace. Explains how to get Kettle solutions up and running, then follows the 34 ETL subsystems model, as created by the Kimball Group, to explore the entire ETL lifecycle, including all aspects of data warehousing with Kettle. Goes beyond routine tasks to explore how to extend Kettle and scale Kettle solutions using a distributed “cloud.” Get the most out of Pentaho Kettle and your data warehousing with this detailed guide, from simple single-table data migration to complex multisystem clustered data integration tasks.
Get up and running with the Pentaho Data Integration tool using this hands-on, easy-to-read guide. About This Book: Manipulate your data by exploring, transforming, validating, and integrating it using Pentaho Data Integration 8 CE. A comprehensive guide exploring the features of Pentaho Data Integration 8 CE. Connect to any database engine, explore the databases, and perform all kinds of operations on relational databases. Who This Book Is For: This book is a must-have for software developers, business intelligence analysts, IT students, or anyone involved or interested in developing ETL solutions. If you plan on using Pentaho Data Integration for any data manipulation task, this book will help you as well. This book is also a good starting point for data warehouse designers, architects, or anyone who is responsible for data warehouse projects and needs to load data into them. What You Will Learn: Explore the features and capabilities of Pentaho Data Integration 8 Community Edition. Install and get started with PDI. Learn the ins and outs of Spoon, the graphical designer tool. Learn to get data from all kinds of data sources, such as plain files, Excel spreadsheets, databases, and XML files. Use Pentaho Data Integration to perform CRUD (create, read, update, and delete) operations on relational databases. Populate a data mart with Pentaho Data Integration. Use Pentaho Data Integration to organize files and folders, run daily processes, deal with errors, and more. In Detail: Pentaho Data Integration (PDI) is an intuitive, graphical environment packed with drag-and-drop design and powerful Extract-Transform-Load (ETL) capabilities. This book shows and explains the new interactive features of Spoon, the revamped look and feel, and the newest features of the tool, including the transformation and job executors and the invaluable Metadata Injection capability. We begin with the installation of PDI software and then move on to cover all the key PDI concepts.
Each chapter introduces new features, enabling you to gradually get practice with the tool. First, you will learn to do all kinds of data manipulation and work with simple plain files. Then, the book teaches you how to work with relational databases inside PDI. Moreover, you will be given a primer on data warehouse concepts and you will learn how to load data into a data warehouse. During the course of this book, you will become familiar with its intuitive, graphical, drag-and-drop design environment. By the end of this book, you will have learned everything you need to know to meet your data manipulation requirements. In addition, you will be given best practices and advice for designing and deploying your projects. Style and approach: A step-by-step guide filled with practical, real-world scenarios and examples.
The 2016 2nd International Conference on Energy Equipment Science and Engineering (ICEESE 2016) was held on November 12-14, 2016 in Guangzhou, China. ICEESE 2016 brought together innovative academics and industrial experts in the field of energy equipment science and engineering to a common forum. The primary goal of the conference is to promote research and developmental activities in energy equipment science and engineering and another goal is to promote scientific information interchange between researchers, developers, engineers, students, and practitioners working all around the world. The conference will be held every year to make it an ideal platform for people to share views and experiences in energy equipment science and engineering and related areas. This second volume of the two-volume set of proceedings covers the field of Structural and Materials Sciences, and Computer Simulation & Computer and Electrical Engineering.
The two-volume proceedings of the ACIIDS 2015 conference, LNAI 9011 + 9012, constitutes the refereed proceedings of the 7th Asian Conference on Intelligent Information and Database Systems, held in Bali, Indonesia, in March 2015. The total of 117 full papers accepted for publication in these proceedings was carefully reviewed and selected from 332 submissions. They are organized in the following topical sections: semantic web, social networks and recommendation systems; text processing and information retrieval; intelligent database systems; intelligent information systems; decision support and control systems; machine learning and data mining; multiple model approach to machine learning; innovations in intelligent systems and applications; bio-inspired optimization techniques and their applications; machine learning in biometrics and bioinformatics with applications; advanced data mining techniques and applications; collective intelligent systems for e-market trading, technology opportunity discovery and collaborative learning; intelligent information systems in security and defense; analysis of image, video and motion data in life sciences; augmented reality and 3D media; cloud based solutions; internet of things, big data and cloud computing; and artificial intelligent techniques and their application in engineering and operational research.
With this textbook, Vaisman and Zimányi deliver excellent coverage of data warehousing and business intelligence technologies ranging from the most basic principles to recent findings and applications. To this end, their work is structured into three parts. Part I describes “Fundamental Concepts” including multi-dimensional models; conceptual and logical data warehouse design; and MDX and SQL/OLAP. Subsequently, Part II details “Implementation and Deployment,” which includes physical data warehouse design; data extraction, transformation, and loading (ETL); and data analytics. Lastly, Part III covers “Advanced Topics” such as spatial data warehouses; trajectory data warehouses; semantic technologies in data warehouses; and novel technologies like MapReduce, column-store databases, and in-memory databases. As a key characteristic of the book, most of the topics are presented and illustrated using application tools. Specifically, a case study based on the well-known Northwind database illustrates how the concepts presented in the book can be implemented using Microsoft Analysis Services and Pentaho Business Analytics. All chapters are summarized using review questions and exercises to support comprehensive student learning. Supplemental material to assist instructors using this book as a course text is available at http://cs.ulb.ac.be/DWSDIbook/, including electronic versions of the figures, solutions to all exercises, and a set of slides accompanying each chapter. Overall, students, practitioners and researchers alike will find this book the most comprehensive reference work on data warehouses, with key topics described in a clear and educational style.
Author: Roland Bouman
Publisher: John Wiley & Sons
Release Date: 2011-02-08
Your all-in-one resource for using Pentaho with MySQL for Business Intelligence and Data Warehousing. Open-source Pentaho provides business intelligence (BI) and data warehousing solutions at a fraction of the cost of proprietary solutions. Now you can take advantage of Pentaho for your business needs with this practical guide written by two major participants in the Pentaho community. The book covers all components of the Pentaho BI Suite. You'll learn to install, use, and maintain Pentaho, and find plenty of background discussion that will bring you thoroughly up to speed on BI and Pentaho concepts. Of all available open source BI products, Pentaho offers the most comprehensive toolset and is the fastest growing open source product suite. Explains how to build and load a data warehouse with Pentaho Kettle for data integration/ETL, manually create JFree (Pentaho reporting services) reports using direct SQL queries, and create Mondrian (Pentaho analysis services) cubes and attach them to a JPivot cube browser. Reviews deploying reports, cubes, and metadata to the Pentaho platform in order to distribute BI solutions to end users. Shows how to set up scheduling, subscription, and automatic distribution. The companion Web site provides complete source code examples, sample data, and links to related resources.
This book gives step-by-step instructions on how to do things. The basics are explained first and then examples help to clarify and reinforce the principles. The book is aimed at experienced Java developers and system architects who want to develop complex Java applications using the OSWorkflow workflow engine. OSWorkflow is a flexible low-level workflow implementation for developers and architects; it is not a quick "plug-and-play" solution for non-technical end users.