Developer’s Guide
The following guide is intended for those interested in the inner workings of nodepool and its various processes.
Operation
If you send a SIGUSR2 to one of the daemon processes, Nodepool will
dump a stack trace for each running thread into its debug log. It is
written under the log bucket nodepool.stack_dump. This is useful
for tracking down deadlocked or otherwise slow threads.
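The mechanism described above can be sketched roughly as follows. This is a minimal illustration and not Nodepool's actual handler: the function names format_stack_dump, stack_dump_handler and install_stack_dump_handler are hypothetical, and only the logger name nodepool.stack_dump comes from the text.

```python
import logging
import signal
import sys
import threading
import traceback


def format_stack_dump():
    """Return a text dump with one stack trace per running thread."""
    names = {t.ident: t.name for t in threading.enumerate()}
    lines = []
    for thread_id, frame in sys._current_frames().items():
        lines.append("Thread: %s %s" % (thread_id, names.get(thread_id, "unknown")))
        lines.extend(s.rstrip("\n") for s in traceback.format_stack(frame))
    return "\n".join(lines)


def stack_dump_handler(signum, frame):
    # Write the dump at debug level under the "nodepool.stack_dump" logger.
    logging.getLogger("nodepool.stack_dump").debug(format_stack_dump())


def install_stack_dump_handler():
    """Register the handler; Python only allows this from the main thread."""
    on_main = threading.current_thread() is threading.main_thread()
    if hasattr(signal, "SIGUSR2") and on_main:  # SIGUSR2 is absent on Windows
        signal.signal(signal.SIGUSR2, stack_dump_handler)
        return True
    return False
```

With such a handler installed, running kill -USR2 against the daemon's process id would produce the dump in the debug log.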
Nodepool Builder
The following is the overall diagram for the nodepool-builder process and its most important pieces:
                 +-----------------+
                 |    ZooKeeper    |
                 +-----------------+
                        ^   |
                 bld    |   | watch
 +------------+  req    |   | trigger
 |   client   +---------+   |           +--------------------+
 +------------+             |           | NodepoolBuilderApp |
                            |           +---+----------------+
                            |               |
                            |               | start/stop
                            |               |
                    +-------v-------+       |
                    |               <-------+
          +--------->   NodePool-   <----------+
          |     +---+    Builder    +---+      |
          |     |   |               |   |      |
          |     |   +---------------+   |      |
          |     |                       |      |
    done  |     | start           start |      | done
          |     | bld              upld |      |
          |     |                       |      |
          |     |                       |      |
      +---------v---+               +---v----------+
      | BuildWorker |               | UploadWorker |
      +-+-------------+             +-+--------------+
        | BuildWorker |               | UploadWorker |
        +-+-------------+             +-+--------------+
          | BuildWorker |               | UploadWorker |
          +-------------+               +--------------+
Drivers
- class nodepool.driver.Driver
  The Driver interface
  This is the main entrypoint for a Driver. A single instance of this will be created for each driver in the system and will persist for the lifetime of the process.
  The class or instance attribute name must be provided as a string.
  - abstract getProvider(provider_config)
    Return a Provider instance
    - Parameters
      provider_config (dict) – A ProviderConfig instance
  - abstract getProviderConfig(provider)
    Return a ProviderConfig instance
    - Parameters
      provider (dict) – The parsed provider configuration
  - reset()
    Called before loading configuration to reset any global state
- class nodepool.driver.Provider
  The Provider interface
  Drivers implement this interface to supply Providers. Each “provider” in the nodepool configuration corresponds to an instance of a class which implements this interface.
  If the configuration is changed, old provider instances will be stopped and new ones created as necessary.
  The class or instance attribute name must be provided as a string.
  - abstract cleanupLeakedResources()
    Clean up any leaked resources
    This is called periodically to give the provider a chance to clean up any resources which may have leaked.
  - abstract cleanupNode(node_id)
    Clean up a node after use
    The driver may delete the node or return it to the pool. This may be called after the node was used, or as part of cleanup from an aborted launch attempt.
    - Parameters
      node_id (str) – The id of the node
  - abstract getRequestHandler(poolworker, request)
    Return a NodeRequestHandler for the supplied request
  - abstract join()
    Wait for provider to finish
    On shutdown, this is called after stop() and should return when the provider has completed all tasks. This may not be called on reconfiguration (so drivers should not rely on this always being called after stop).
  - abstract labelReady(name)
    Determine if a label is ready in this provider
    If the prerequisites for this label are ready, return True. For example, if the label requires an image that is not present, this should return False. This method should not examine inventory or quota. In other words, it should return True if a request for the label would be expected to succeed with no resource contention, but False if it is not possible to satisfy a request for the label.
    - Parameters
      name (str) – The name of the label
    - Returns
      True if the label is ready in this provider, False otherwise.
  - abstract start(zk_conn)
    Start this provider
    - Parameters
      zk_conn (ZooKeeper) – A ZooKeeper connection object.
    This is called after each configuration change to allow the driver to perform initialization tasks and start background threads. The ZooKeeper connection object is provided in case the Provider needs to interact with it.
  - abstract stop()
    Stop this provider
    Before shutdown or reconfiguration, this is called to signal to the driver that it will no longer be used. It should not begin any new tasks, but may allow currently running tasks to continue.
  - abstract waitForNodeCleanup(node_id)
    Wait for a node to be cleaned up
    When called, this is called after cleanupNode(). This method should return after the node has been deleted or returned to the pool.
    - Parameters
      node_id (str) – The id of the node
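As a rough illustration of the start(), stop(), join() and labelReady() contracts described above, the following sketch uses a plain Python class rather than the real nodepool.driver.Provider base class. The class name SketchProvider, its constructor arguments and the _background helper are assumptions made for this example only.

```python
import threading


class SketchProvider:
    """Hypothetical provider showing the start/stop/join lifecycle."""

    def __init__(self, labels, available_images):
        self.labels = labels  # label name -> image the label requires
        self.available_images = set(available_images)
        self.zk = None
        self._stop_event = threading.Event()
        self._thread = None

    def start(self, zk_conn):
        # Called after each configuration change: keep the ZooKeeper
        # connection and start background work.
        self.zk = zk_conn
        self._stop_event.clear()
        self._thread = threading.Thread(target=self._background, daemon=True)
        self._thread.start()

    def _background(self):
        # Wake up every 0.05 seconds until stop() is called.
        while not self._stop_event.wait(0.05):
            pass  # a real provider would do periodic work here

    def stop(self):
        # Only signal: no new tasks begin, running ones may finish.
        self._stop_event.set()

    def join(self):
        # Block until background work is done. This may never be called on
        # reconfiguration, so stop() must not depend on it.
        if self._thread is not None:
            self._thread.join()

    def labelReady(self, name):
        # Checks prerequisites only (the image), never inventory or quota.
        image = self.labels.get(name)
        return image is not None and image in self.available_images
```

Keeping stop() non-blocking and putting the wait in join() matches the documented split between signalling shutdown and waiting for completion.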
- class nodepool.driver.ProviderNotifications
  Notification interface for Provider objects.
  This groups all notification messages bound for the Provider. The Provider class inherits from this by default. A Provider overrides the methods here if it wants to handle the notification.
  - nodeDeletedNotification(node)
    Called after the ZooKeeper object for a node is deleted.
    - Parameters
      node (Node) – Object describing the node just deleted.
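A provider that wants to react to node deletion might override the notification roughly as follows. The classes here are stand-ins rather than the real nodepool.driver classes, and the cache attribute and the use of node.id as a key are assumptions made for illustration.

```python
from types import SimpleNamespace


class ProviderNotificationsSketch:
    """Stand-in for the notification interface: the default does nothing."""

    def nodeDeletedNotification(self, node):
        pass


class CachingProvider(ProviderNotificationsSketch):
    """Hypothetical provider that keeps per-node bookkeeping."""

    def __init__(self):
        self.cache = {}  # node id -> driver-side data

    def nodeDeletedNotification(self, node):
        # Drop bookkeeping once the node's ZooKeeper object is gone.
        self.cache.pop(node.id, None)
```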
- class nodepool.driver.NodeRequestHandler(pw, request)
  Class to process a single nodeset request.
  The PoolWorker thread will instantiate a class of this type for each node request that it pulls from ZooKeeper.
  Subclasses are required to implement the launch method.
  - abstract property alive_thread_count
    Return the number of active node launching threads in use by this request handler.
    This is used to limit request handling threads for a provider.
    This is an approximate, upper-bound number of alive threads, since some threads may have finished by the time the calculation completes.
    - Returns
      A count (integer) of active threads.
  - checkReusableNode(node)
    Handler may implement this to verify a node can be re-used. The OpenStack handler uses this to verify the node az is correct.
  - hasProviderQuota(node_types)
    Checks if a provider has enough quota to handle a list of nodes. This does not take our currently existing nodes into account.
    - Parameters
      node_types – list of node types to check
    - Returns
      True if the node list fits into the provider, False otherwise
  - hasRemainingQuota(ntype)
    Checks if the predicted quota is enough for an additional node of type ntype.
    - Parameters
      ntype – node type for the quota check
    - Returns
      True if there is enough quota, False otherwise
  - abstract imagesAvailable()
    Handler needs to implement this to determine whether the requested images in self.request.node_types are available for this provider.
    - Returns
      True if they are available, False otherwise.
  - abstract launch(node)
    Handler needs to implement this to launch the node.
  - abstract launchesComplete()
    Handler needs to implement this to check if all nodes in self.nodeset have completed the launch sequence.
    This method will be called periodically to check on launch progress.
    - Returns
      True if all launches are complete (successfully or not), False otherwise.
  - run()
    Execute node request handling.
    This code is designed to be re-entrant. Because we can’t always satisfy a request immediately (due to lack of provider resources), we need to be able to call run() repeatedly until the request can be fulfilled. The node set is saved and added to between calls.
  - setNodeMetadata(node)
    Handler may implement this to store driver-specific metadata in the Node object before building the node. This data is normally calculated dynamically at runtime. The OpenStack handler uses this to set az, cloud and region.
  - unlockNodeSet(clear_allocation=False)
    Attempt unlocking all Nodes in the node set.
    - Parameters
      clear_allocation (bool) – If true, clears the node allocated_to attribute.
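The re-entrant behaviour of run() described above can be sketched as follows. This is a simplified stand-in, not the real NodeRequestHandler: nodes are plain dicts, quota is a hypothetical dict with a single free counter, and the done flag is an assumption used to mark completion.

```python
class ReentrantHandlerSketch:
    """Hypothetical handler whose run() can be called until fulfilled."""

    def __init__(self, node_types, quota):
        self.node_types = list(node_types)  # requested labels, in order
        self.quota = quota  # {"free": <number of free slots>}
        self.nodeset = []  # saved and added to between run() calls
        self.done = False

    def hasRemainingQuota(self, ntype):
        return self.quota["free"] > 0

    def launch(self, node):
        # Consume one slot and mark the node as being built.
        self.quota["free"] -= 1
        node["state"] = "building"

    def run(self):
        if self.done:
            return
        # Only handle the node types not yet covered by the saved nodeset.
        for ntype in self.node_types[len(self.nodeset):]:
            if not self.hasRemainingQuota(ntype):
                return  # not enough resources now; retry on the next call
            node = {"type": ntype, "state": "init"}
            self.launch(node)
            self.nodeset.append(node)
        self.done = True
```

Each call picks up where the previous one stopped, so a caller can simply invoke run() again once resources free up.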
- class nodepool.driver.NodeRequestHandlerNotifications
  Notification interface for NodeRequestHandler objects.
  This groups all notification messages bound for the NodeRequestHandler. The NodeRequestHandler class inherits from this by default. A request handler overrides the methods here if it wants to handle the notification.
  - nodeReusedNotification(node)
    Handler may implement this to be notified when a node is re-used. The OpenStack handler uses this to set the chosen_az.
- class nodepool.driver.ProviderConfig(provider)
  The Provider config interface
  The class or instance attribute name must be provided as a string.
  - abstract getSchema()
    Return a voluptuous schema for config validation
  - abstract getSupportedLabels(pool_name=None)
    Return a set of label names supported by this provider.
    - Parameters
      pool_name (str) – If provided, get labels for the given pool only.
  - abstract load(newconfig)
    Update this config object from the supplied parsed config
  - abstract property manage_images
    Return True if provider manages external images, False otherwise.
  - abstract property pools
    Return a dict of ConfigPool-based objects, indexed by pool name.
Writing A New Provider Driver
Nodepool drivers are loaded from the nodepool/drivers directory. A driver is composed of three main objects:
A ProviderConfig to manage validation and loading of the provider.
A Provider to manage resource allocations.
A NodeRequestHandler to manage nodeset (collection of resources) allocations.
These objects are referenced from the main Driver interface, which must be implemented in the __init__.py file of the driver directory.
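A driver's __init__.py might be laid out roughly like this. To keep the sketch runnable without Nodepool installed, Driver is redefined here as a stand-in abstract class with the documented methods; a real driver would import it from nodepool.driver. The names ExampleDriver, ExampleProviderConfig and ExampleProvider are hypothetical.

```python
import abc


class Driver(abc.ABC):
    """Stand-in for nodepool.driver.Driver (assumption for this sketch)."""

    @abc.abstractmethod
    def getProviderConfig(self, provider):
        """Return a ProviderConfig for the parsed provider configuration."""

    @abc.abstractmethod
    def getProvider(self, provider_config):
        """Return a Provider for the given ProviderConfig."""

    def reset(self):
        """Called before loading configuration to reset global state."""


class ExampleProviderConfig:
    # Constructed with the driver object and the provider configuration dict.
    def __init__(self, driver, provider):
        self.driver_object = driver
        self.provider = provider


class ExampleProvider:
    # Constructed with the ProviderConfig.
    def __init__(self, provider_config):
        self.provider = provider_config


class ExampleDriver(Driver):
    def getProviderConfig(self, provider):
        return ExampleProviderConfig(self, provider)

    def getProvider(self, provider_config):
        return ExampleProvider(provider_config)
```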
ProviderConfig
The ProviderConfig is constructed with the driver object and the provider configuration dictionary.
The main procedures of the ProviderConfig are:
getSchema() exposes a voluptuous schema of the provider configuration.
load(config) parses the provider configuration. Note that the config argument is the global nodepool.yaml configuration. Each provided label needs to be referenced back to the global config.labels dictionary so that the launcher service knows which provider provides which labels.
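The label back-reference performed by load() could look roughly like the sketch below. It assumes that each entry of the global config.labels dictionary has a pools list to append to, and it uses SimpleNamespace objects in place of Nodepool's real configuration classes; the provider dictionary layout is also an assumption. Schema validation via getSchema() is left out.

```python
from types import SimpleNamespace


class ExampleProviderConfig:
    """Hypothetical ProviderConfig showing only load() and label lookup."""

    def __init__(self, driver, provider):
        self.driver_object = driver
        self.provider = provider  # parsed provider configuration (dict)
        self._pools = {}

    @property
    def pools(self):
        # Pool objects indexed by pool name.
        return self._pools

    def load(self, config):
        for pool in self.provider.get("pools", []):
            pool_cfg = SimpleNamespace(
                name=pool["name"], labels=list(pool.get("labels", []))
            )
            self._pools[pool_cfg.name] = pool_cfg
            for label_name in pool_cfg.labels:
                # Reference the label back to the global config.labels so the
                # launcher can tell which pools provide it.
                config.labels[label_name].pools.append(pool_cfg)

    def getSupportedLabels(self, pool_name=None):
        labels = set()
        for pool in self._pools.values():
            if pool_name is None or pool.name == pool_name:
                labels.update(pool.labels)
        return labels
```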
Provider
The Provider is constructed with the ProviderConfig.
The main procedures of the Provider are:
cleanupNode(external_id) terminates a resource.
listNodes() returns the list of existing resources. This procedure needs to map the nodepool_node_id to each resource. If the provider doesn’t support resource metadata, the driver needs to implement a storage facility to associate resources created by Nodepool with the internal nodepool_node_id. The launcher periodically looks for non-existent node_id values in listNodes() to delete any leaked resources.
getRequestHandler(pool, request) returns a NodeRequestHandler object to manage the creation of resources. The contract between the handler and the provider is free form. As a rule of thumb, the handler should be in charge of interfacing with Nodepool’s database while the provider should provide primitives to create resources. For example, the Provider is likely to implement a createResource(pool, label) procedure that will be used by the handler.
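The leak check that relies on listNodes() can be illustrated with a small helper. This is only a sketch of the idea, not the launcher's code: the resource layout (a dict with external_id and metadata keys) and the function name find_leaked_resources are assumptions, and resources without a nodepool_node_id are treated as not belonging to Nodepool.

```python
def find_leaked_resources(listed_resources, known_node_ids):
    """Return external ids of resources whose Nodepool node no longer exists.

    listed_resources: what a hypothetical listNodes() returned.
    known_node_ids: ids of the nodes currently recorded by Nodepool.
    """
    leaked = []
    for resource in listed_resources:
        node_id = resource["metadata"].get("nodepool_node_id")
        if node_id is None:
            continue  # not created by Nodepool, leave it alone
        if node_id not in known_node_ids:
            leaked.append(resource["external_id"])
    return leaked
```

Each returned id would then be handed to the provider's cleanup primitive.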
NodeRequestHandler
The NodeRequestHandler is constructed with the assigned pool and the request object. Before the handler is used, the following attributes are set:
self.provider : the provider configuration.
self.pool : the pool configuration.
self.zk : the database client.
self.manager : the Provider object.
The main procedures of the NodeRequestHandler are:
launch(node) starts the creation of a new resource.
launchesComplete() returns True if all the nodes in the handler's self.nodeset attribute are READY.
A handler may not have to launch every node in the nodeset, as Nodepool will re-use existing nodes.
The launch procedure usually consists of the following operations:
Use the provider to create the resources associated with the node label. Once an external_id is obtained, it should be stored in node.external_id.
Once the resource is created, READY should be stored in node.state. Otherwise, raise an exception to restart the launch attempt.
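The two launch steps above can be sketched as follows. Everything here is a stand-in: FakeProvider and its createResource(pool, label) method follow the example procedure mentioned earlier, store_node is a hypothetical callback representing a write to Nodepool's database, and the READY constant is defined locally rather than imported from Nodepool.

```python
from types import SimpleNamespace

READY = "ready"  # stand-in for Nodepool's READY state constant


class FakeProvider:
    """Hypothetical provider exposing a resource creation primitive."""

    def __init__(self, fail=False):
        self.fail = fail

    def createResource(self, pool, label):
        if self.fail:
            raise RuntimeError("resource creation failed")
        return "res-%s-%s" % (pool, label)


def launch(manager, pool, node, store_node):
    # Step 1: create the resource and record its external id on the node.
    node.external_id = manager.createResource(pool, node.type)
    store_node(node)
    # Step 2: the resource exists, so mark the node READY and save it.
    node.state = READY
    store_node(node)
```

If createResource raises, the exception propagates out of launch() and the node is never marked READY, which is the signal for the launch attempt to be restarted.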