
The Clusters API manages Genie's logical worker cluster; it does not manage the underlying infrastructure itself. Together, the Clusters API, Commands API, Applications API, and Jobs API provide the semantics required to operate Genie 3 and cover the scope of Genie's functionality. For developers, the local-mode workflow generates run scripts for various runtimes and also integrates with the underlying runtime's REPL and stdout to support testing and development. Genie generates application run scripts for runtimes such as Spark, Hadoop, Pig, Hive, PrestoDB, and Sqoop, independent of the specific runtime configuration or the data to process. Application runtimes and their executable commands are configurable through their APIs. In Genie 3, tasks are composed of several abstractions intended to support scalability. The Genie 3 approach is to keep runtimes and their configuration modular, descriptive, and versioned, with an improved data model.
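To make the idea of modular, versioned runtimes more concrete, here is a minimal Python sketch. It is not Genie's implementation: the class names, fields, tag strings, and the tag-matching rule in `resolve` are all assumptions chosen for illustration. It only shows how separately registered applications, commands, and clusters might be combined when a task is requested.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Optional, Tuple


@dataclass(frozen=True)
class Application:
    # A versioned runtime, e.g. one particular Spark release (hypothetical fields).
    name: str
    version: str


@dataclass(frozen=True)
class Command:
    # An executable that depends on zero or more applications.
    name: str
    version: str
    executable: str
    tags: FrozenSet[str]
    applications: Tuple[Application, ...] = ()


@dataclass(frozen=True)
class Cluster:
    # A logical cluster and the commands it is allowed to run.
    name: str
    version: str
    tags: FrozenSet[str]
    commands: Tuple[Command, ...] = ()


def resolve(
    clusters: Iterable[Cluster],
    cluster_tags: Iterable[str],
    command_tags: Iterable[str],
) -> Optional[Tuple[Cluster, Command]]:
    """Return the first (cluster, command) pair carrying all requested tags."""
    wanted_cluster = set(cluster_tags)
    wanted_command = set(command_tags)
    for cluster in clusters:
        if not wanted_cluster <= cluster.tags:
            continue
        for command in cluster.commands:
            if wanted_command <= command.tags:
                return cluster, command
    return None


# Illustrative registrations; every name, version, and tag here is made up.
spark = Application("spark", "2.0.0")
spark_submit = Command(
    "spark-submit", "2.0.0", "spark-submit", frozenset({"type:spark"}), (spark,)
)
prod = Cluster(
    "prod-yarn", "2.7.1", frozenset({"type:yarn", "sched:adhoc"}), (spark_submit,)
)

match = resolve([prod], {"type:yarn"}, {"type:spark"})
no_match = resolve([prod], {"type:presto"}, {"type:spark"})
```

Because each entity carries its own version, a new Spark release could be registered as a new `Application` without touching the cluster or command definitions, which is the kind of isolation the modular approach aims for.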
In earlier versions, the single run script shared by all task runtimes grew too large to separate concerns safely, and it reduced the project maintainers' ability to isolate risk when introducing code changes as the project grew.
Netflix announced that Genie 3 has several new features, including a redesign of the earlier task execution engine, security functionality, dependency caching, and API changes. Earlier versions of the Genie engine didn't have leader election, resulting in workers unnecessarily executing the same tasks. Now, cluster leadership is supported through Zookeeper or as a manual configuration property set to a single node's IP address.


Genie is a distributed, RESTful task orchestration engine for the data platform from Netflix. It has two primary use cases: the first is creating and submitting custom data-processing task requests; the second is setting up local environments to develop and test new applications and tasks that will run on a Genie cluster.
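As an illustration of the first use case, the sketch below builds a job request body in Python. The field names (`commandArgs`, `clusterCriterias`, `commandCriteria`) and the `/api/v3/jobs` path mentioned in the comments reflect one reading of the Genie 3 REST API and should be verified against the official API documentation. The helper `build_job_request`, the user, the tags, and the query are hypothetical placeholders.

```python
import json


def build_job_request(name, user, version, command_args, cluster_tags, command_tags):
    """Build a JSON-serializable job request (field names are assumptions)."""
    return {
        "name": name,
        "user": user,
        "version": version,
        "commandArgs": command_args,
        # Each criteria entry is a set of tags a cluster must carry.
        "clusterCriterias": [{"tags": sorted(cluster_tags)}],
        # Tags the selected command must carry.
        "commandCriteria": sorted(command_tags),
    }


request_body = build_job_request(
    name="daily-aggregation",        # placeholder job name
    user="analyst",                  # placeholder user
    version="1.0",
    command_args="-f query.sql",     # placeholder arguments
    cluster_tags={"type:yarn", "sched:adhoc"},
    command_tags={"type:hive"},
)

payload = json.dumps(request_body)
# Submitting would be an HTTP POST of `payload` with
# Content-Type: application/json to a URL such as
# http://<genie-host>/api/v3/jobs (host is a placeholder).
```

The request describes what should run and what kind of cluster it needs through tags, rather than naming a specific cluster, so the choice of cluster and command is left to the service.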
