Just a few of 1000s of Use Cases
The ease of working with StreamZero with it's ability to rewire services at will and integrate across clouds and services Agility into engineering.
AwsGlueJobOperator
This operator is deprecated. Please use airflow.providers.amazon.aws.operators.glue.GlueJobOperator.
GlueCrawlerSensor
Waits for an AWS Glue crawler to reach any of the statuses below FAILED, CANCELLED, SUCCEEDED
AwsGlueJobSensor
This sensor is deprecated. Please use airflow.providers.amazon.aws.sensors.glue.GlueJobSensor.
GlueJobOperator
Creates an AWS Glue Job. AWS Glue is a serverless Spark ETL service for running Spark Jobs on the AWS cloud. Language support Python and Scala
GlueCrawlerOperator
Creates, updates and triggers an AWS Glue Crawler. AWS Glue Crawler is a serverless service that manages a catalog of metadata tables that contain the inferred schema, format and data types of data stores within the AWS cloud.
AwsGlueCrawlerOperator
This operator is deprecated. Please use airflow.providers.amazon.aws.operators.glue_crawler.GlueCrawlerOperator.
GlueJobSensor
Waits for an AWS Glue Job to reach any of the status below FAILED, STOPPED, SUCCEEDED
AwsGlueCrawlerSensor
This sensor is deprecated. Please use airflow.providers.amazon.aws.sensors.glue_crawler.GlueCrawlerSensor.
BaseSensorOperator
Sensor operators are derived from this class and inherit these attributes.
S3ToSFTPOperator
This operator enables the transferring of files from S3 to a SFTP server.
BigQueryDeleteDatasetOperator
This operator deletes an existing dataset from your Project in Big query.
BigQueryUpdateTableOperator
This operator is used to update table for your Project in BigQuery. Use fields to specify which fields of table to update. If a field is listed in fields and is None in table, it will be deleted.
BigQueryCheckOperator
Performs checks against BigQuery. The BigQueryCheckOperator expects a sql query that will return a single row. Each value on that first row is evaluated using python bool casting. If any of the values return False the check is failed and errors out.
BigQueryCreateExternalTableOperator
Creates a new external table in the dataset with the data from Google Cloud Storage.
BigQueryInsertJobOperator
Executes a BigQuery job. Waits for the job to complete and returns job id.
BigQueryCreateEmptyTableOperator
Creates a new, empty table in the specified BigQuery dataset, optionally with schema.
BigQueryGetDatasetOperator
This operator is used to return the dataset specified by dataset_id.
BigQueryGetDatasetTablesOperator
This operator retrieves the list of tables in the specified dataset.
BigQueryExecuteQueryOperator
Executes BigQuery SQL queries in a specific BigQuery database. This operator does not assert idempotency.
BigQueryUpdateDatasetOperator
This operator is used to update dataset for your Project in BigQuery. Use fields to specify which fields of dataset to update. If a field is listed in fields and is None in dataset, it will be deleted. If no fields are provided then all fields of provided dataset_resource will be used.
BigQueryPatchDatasetOperator
This operator is used to patch dataset for your Project in BigQuery. It only replaces fields that are provided in the submitted dataset resource.
BigQueryGetDataOperator
Fetches the data from a BigQuery table (alternatively fetch data for selected columns) and returns data in a python list. The number of elements in the returned list will be equal to the number of rows fetched. Each element in the list will again be a list where element would represent the columns values for that row.
BigQueryIntervalCheckOperator
Checks that the values of metrics given as SQL expressions are within a certain tolerance of the ones from days_back before.
BigQueryUpdateTableSchemaOperator
Update BigQuery Table Schema Updates fields on a table schema based on contents of the supplied schema_fields_updates parameter. The supplied schema does not need to be complete, if the field already exists in the schema you only need to supply keys & values for the items you want to patch, just ensure the “name” key is set.
BigQueryCreateEmptyDatasetOperator
This operator is used to create new dataset for your Project in BigQuery.
DataflowTemplatedJobStartOperator
Start a Templated Cloud Dataflow job. The parameters of the operation will be passed to the job.
DataflowCreatePythonJobOperator
Launching Cloud Dataflow jobs written in python. Note that both dataflow_default_options and options will be merged to specify pipeline execution parameter, and dataflow_default_options is expected to save high-level options, for instances, project and zone information, which apply to all dataflow operators in the DAG.
DataflowCreateJavaJobOperator
Start a Java Cloud Dataflow batch job. The parameters of the operation will be passed to the job.
BaseSecretsBackend
Abstract base class to retrieve Connection object given a conn_id or Variable given a key
EnvironmentVariablesBackend
Retrieves Connection object and Variable from environment variable.
GCSFileTransformOperator
Copies data from a source GCS location to a temporary location on the local filesystem. Runs a transformation on this file as specified by the transformation script and uploads the output to a destination bucket. If the output bucket is not specified the original file will be overwritten.
GCSDeleteObjectsOperator
Deletes objects from a Google Cloud Storage bucket, either from an explicit list of object names or all objects matching a prefix.
GCSTimeSpanFileTransformOperator
Determines a list of objects that were added or modified at a GCS source location during a specific time-span, copies them to a temporary location on the local file system, runs a transform on this file as specified by the transformation script and uploads the output to the destination bucket.
GCSCreateBucketOperator
Creates a new bucket. Google Cloud Storage uses a flat namespace, so you cant create a bucket with a name that is already in use.
GCSSynchronizeBucketsOperator
Synchronizes the contents of the buckets or buckets directories in the Google Cloud Services.
GCSListObjectsOperator
List all objects from the bucket with the given string prefix and delimiter in name.
EmrCreateJobFlowOperator
Creates an EMR JobFlow, reading the config from the EMR connection. A dictionary of JobFlow overrides can be passed that override the config from the connection.
EmrJobFlowSensor
Asks for the state of the EMR JobFlow (Cluster) until it reaches any of the target states. If it fails the sensor errors, failing the task.
EmrContainerSensor
Asks for the state of the job run until it reaches a failure state or success state. If the job run fails, the task will fail.
EmrStepSensor
Asks for the state of the step until it reaches any of the target states. If it fails the sensor errors, failing the task.
DataprocMetastoreCreateServiceOperator
Creates a metastore service in a project and location.
DataprocMetastoreCreateBackupOperator
Creates a new backup in a given project and location.
DataprocMetastoreCreateMetadataImportOperator
Creates a new MetadataImport in a given project and location.
DatabricksRunNowOperator
Runs an existing Spark job run to Databricks using the api/2.0/jobs/run-now API endpoint.
DatabricksSubmitRunOperator
Submits a Spark job run to Databricks using the api/2.0/jobs/runs/submit API endpoint.
GreatExpectationsOperator
An operator to leverage Great Expectations as a task in your Airflow DAG.
GKEStartPodOperator
Executes a task in a Kubernetes pod in the specified Google Kubernetes Engine cluster
GKECreateClusterOperator
Create a Google Kubernetes Engine Cluster of specified dimensions The operator will wait until the cluster is created.
GKEDeleteClusterOperator
Deletes the cluster, including the Kubernetes endpoint and all worker nodes.
TriggerDagRunLink
Operator link for TriggerDagRunOperator. It allows users to access DAG triggered by task using TriggerDagRunOperator.
PythonVirtualenvOperator
Allows one to run a function in a virtualenv that is created and destroyed automatically (with certain caveats).
ShortCircuitOperator
Allows a workflow to continue only if a condition is met. Otherwise, the workflow “short-circuits” and downstream tasks are skipped.
BranchPythonOperator
Allows a workflow to “branch” or follow a path following the execution of this task.
ECSOperator
This operator is deprecated. Please use airflow.providers.amazon.aws.operators.ecs.EcsOperator.
CloudSQLDatabaseHook
Serves DB connection configuration for Google Cloud SQL (Connections of gcpcloudsqldb type).
GoogleDeploymentManagerHook
Interact with Google Cloud Deployment Manager using the Google Cloud connection. This allows for scheduled and programmatic inspection and deletion fo resources managed by GDM.
S3TaskHandler
S3TaskHandler is a python log handler that handles and reads task instance logs. It extends airflow FileTaskHandler and uploads to and reads from S3 remote storage.
EmrContainerHook
Interact with AWS EMR Virtual Cluster to run, poll jobs and return job status Additional arguments (such as aws_conn_id) may be specified and are passed down to the underlying AwsBaseHook.
EmrHook
Interact with AWS EMR. emr_conn_id is only necessary for using the create_job_flow method.
DataprocCreateClusterOperator
Create a new cluster on Google Cloud Dataproc. The operator will wait until the creation is successful or an error occurs in the creation process. If the cluster already exists and use_if_exists is True then the operator will.
DataprocInstantiateWorkflowTemplateOperator
Instantiate a WorkflowTemplate on Google Cloud Dataproc. The operator will wait until the WorkflowTemplate is finished executing.
DataprocSubmitSparkSqlJobOperator
Start a Spark SQL query Job on a Cloud DataProc cluster.
DataprocInstantiateInlineWorkflowTemplateOperator
Instantiate a WorkflowTemplate Inline on Google Cloud Dataproc. The operator will wait until the WorkflowTemplate is finished executing.
DataprocScaleClusterOperator
Scale, up or down, a cluster on Google Cloud Dataproc. The operator will wait until the cluster is re-scaled.
DataprocSubmitPigJobOperator
Start a Pig query Job on a Cloud DataProc cluster. The parameters of the operation will be passed to the cluster.
CloudBuildDeleteBuildTriggerOperator
Deletes a BuildTrigger by its project ID and trigger ID.
CloudBuildRetryBuildOperator
Creates a new build based on the specified build. This method creates a new build using the original build request, which may or may not result in an identical build.
CloudBuildUpdateBuildTriggerOperator
Updates a BuildTrigger by its project ID and trigger ID.
AWSAthenaHook
This hook is deprecated. Please use airflow.providers.amazon.aws.hooks.athena.AthenaHook.
DynamoDBToS3Operator
Replicates records from a DynamoDB table to S3. It scans a DynamoDB table and write the received records to a file on the local filesystem. It flushes the file to S3 once the file size exceeds the file size limit specified by the user.
BatchSensor
Asks for the state of the Batch Job execution until it reaches a failure state or success state. If the job fails, the task will fail.
AwsBatchOperator
This operator is deprecated. Please use airflow.providers.amazon.aws.operators.batch.BatchOperator.
_DockerDecoratedOperator
Wraps a Python callable and captures args/kwargs when called for execution.
SFTPHook
This hook is inherited from SSH hook. Please refer to SSH hook for the input arguments.
FromXLSXQueryOperator
Execute an SQL query an XLSX/XLS file and export the result into a Parquet or CSV file
CloudMemorystoreMemcachedUpdateInstanceOperator
Updates the metadata and configuration of a specific Memcached instance.
CloudMemorystoreCreateInstanceAndImportOperator
Creates a Redis instance based on the specified tier and memory size and import a Redis RDB snapshot file from Cloud Storage into a this instance.
CloudMemorystoreMemcachedGetInstanceOperator
Gets the details of a specific Memcached instance.
CloudMemorystoreImportOperator
Import a Redis RDB snapshot file from Cloud Storage into a Redis instance.
CloudMemorystoreExportAndDeleteInstanceOperator
Export Redis instance data into a Redis RDB format file in Cloud Storage. In next step, deletes a this instance.
CloudMemorystoreMemcachedApplyParametersOperator
Will update current set of Parameters to the set of specified nodes of the Memcached Instance.
CloudMemorystoreExportInstanceOperator
Export Redis instance data into a Redis RDB format file in Cloud Storage.
CloudMemorystoreMemcachedCreateInstanceOperator
Creates a Memcached instance based on the specified tier and memory size.
CloudMemorystoreListInstancesOperator
Lists all Redis instances owned by a project in either the specified location (region) or all locations.
CloudMemorystoreScaleInstanceOperator
Updates the metadata and configuration of a specific Redis instance.
CloudMemorystoreCreateInstanceOperator
Creates a Redis instance based on the specified tier and memory size.
CloudMemorystoreDeleteInstanceOperator
Deletes a specific Redis instance. Instance stops serving and data is deleted.
CloudMemorystoreFailoverInstanceOperator
Initiates a failover of the primary node to current replica node for a specific STANDARD tier Cloud Memorystore for Redis instance.
CloudMemorystoreMemcachedUpdateParametersOperator
parameters, it must be followed by apply_parameters to apply the parameters to nodes of the Memcached Instance.
CloudMemorystoreMemcachedDeleteInstanceOperator
Deletes a specific Memcached instance. Instance stops serving and data is deleted.
CloudMemorystoreUpdateInstanceOperator
Updates the metadata and configuration of a specific Redis instance.
CloudDataFusionCreateInstanceOperator
Creates a new Data Fusion instance in the specified project and location.
CloudDataFusionStartPipelineOperator
Starts a Cloud Data Fusion pipeline. Works for both batch and stream pipelines.
CloudDataFusionStopPipelineOperator
Stops a Cloud Data Fusion pipeline. Works for both batch and stream pipelines.
CloudDataFusionRestartInstanceOperator
Restart a single Data Fusion instance. At the end of an operation instance is fully restarted.
AwsLambdaHook
This hook is deprecated. Please use airflow.providers.amazon.aws.hooks.lambda_function.LambdaHook.
HttpSensor
Executes a HTTP GET statement and returns False on failure caused by 404 Not Found or response_check returning False.
S3DeleteObjectsOperator
To enable users to delete single object or multiple objects from a bucket using a single HTTP request.
S3FileTransformOperator
Copies data from a source S3 location to a temporary location on the local filesystem. Runs a transformation on this file as specified by the transformation script and uploads the output to a destination S3 location.
S3ListPrefixesOperator
List all subfolders from the bucket with the given string prefix in name.
SparkSubmitHook
This hook is a wrapper around the spark-submit binary to kick off a spark-submit job. It requires that the “spark-submit” binary is in the PATH or the spark_home to be supplied.
BigQueryDataTransferServiceStartTransferRunsOperator
Start manual transfer runs to be executed now with schedule_time equal to current time. The transfer runs can be created for a time range where the run_time is between start_time (inclusive) and end_time (exclusive), or for a specific run_time.
EksFargateProfileStateSensor
Check the state of an AWS Fargate profile until it reaches the target state or another terminal state.
EKSFargateProfileStateSensor
This sensor is deprecated. Please use airflow.providers.amazon.aws.sensors.eks.EksFargateProfileStateSensor.
EKSNodegroupStateSensor
This sensor is deprecated. Please use airflow.providers.amazon.aws.sensors.eks.EksNodegroupStateSensor.
EKSClusterStateSensor
This sensor is deprecated. Please use airflow.providers.amazon.aws.sensors.eks.EksClusterStateSensor.
EksNodegroupStateSensor
Check the state of an EKS managed node group until it reaches the target state or another terminal state.
EksClusterStateSensor
Check the state of an Amazon EKS Cluster until it reaches the target state or another terminal state.
BeamRunPythonPipelineOperator
Launching Apache Beam pipelines written in Python. Note that both default_pipeline_options and pipeline_options will be merged to specify pipeline execution parameter, and default_pipeline_options is expected to save high-level options, for instances, project and zone information, which apply to all beam operators in the DAG.
LivyOperator
This operator wraps the Apache Livy batch REST API, allowing to submit a Spark application to the underlying cluster.
CloudDLPHook
Hook for Google Cloud Data Loss Prevention (DLP) APIs. Cloud DLP allows clients to detect the presence of Personally Identifiable Information (PII) and other privacy-sensitive data in user-supplied, unstructured data streams, like text blocks or images. The service also includes methods for sensitive data redaction and scheduling of data scans on Google Cloud based data sets.
CloudTasksHook
Hook for Google Cloud Tasks APIs. Cloud Tasks allows developers to manage the execution of background work in their applications.
GoogleBaseHook
A base hook for Google cloud-related hooks. Google cloud has a shared REST API client that is built in the same way no matter which service you use. This class helps construct and authorize the credentials needed to then call googleapiclient.discovery.build() to actually discover and build a client for a Google cloud service.
GCSTaskHandler
GCSTaskHandler is a python log handler that handles and reads task instance logs. It extends airflow FileTaskHandler and uploads to and reads from GCS remote storage. Upon log reading failure, it reads from host machines local disk.
BigQueryTablePartitionExistenceSensor
Checks for the existence of a partition within a table in Google Bigquery.
S3PrefixSensor
Waits for a prefix or all prefixes to exist. A prefix is the first part of a key, thus enabling checking of constructs similar to glob airfl* or SQL LIKE airfl%. There is the possibility to precise a delimiter to indicate the hierarchy or keys, meaning that the match will stop at that delimiter. Current code accepts sane delimiters, i.e. characters that are NOT special characters in the Python regex engine.
S3KeySensor
Waits for a key (a file-like instance on S3) to be present in a S3 bucket. S3 being a key/value it does not support folders. The path is just a key a resource.
S3KeysUnchangedSensor
Checks for changes in the number of objects at prefix in AWS S3 bucket and returns True if the inactivity period has passed with no increase in the number of objects. Note, this sensor will not behave correctly in reschedule mode, as the state of the listed objects in the S3 bucket will be lost between rescheduled invocations.
S3KeySizeSensor
Waits for a key (a file-like instance on S3) to be present and be more than some size in a S3 bucket. S3 being a key/value it does not support folders. The path is just a key a resource.
ElasticsearchTaskHandler
ElasticsearchTaskHandler is a python log handler that reads logs from Elasticsearch. Note that Airflow does not handle the indexing of logs into Elasticsearch. Instead, Airflow flushes logs into local files. Additional software setup is required to index the logs into Elasticsearch, such as using Filebeat and Logstash. To efficiently query and sort Elasticsearch results, this handler assumes each log message has a field log_id consists of ti primary keys log_id = {dag_id}-{task_id}-{execution_date}-{try_number} Log messages with specific log_id are sorted based on offset, which is a unique integer indicates log messages order. Timestamps here are unreliable because multiple log messages might have the same timestamp.
SensorWork
This class stores a sensor work with decoded context value. It is only used inside of smart sensor.
PostgresToGCSOperator
Copy data from Postgres to Google Cloud Storage in JSON or CSV format.
BaseHook
Abstract base class for hooks, hooks are meant as an interface to interact with external systems. MySqlHook, HiveHook, PigHook return object that can handle the connection and interaction to specific instances of these systems, and expose consistent methods to interact with them.
CloudComposerExecutionTrigger
The trigger handles the async communication with the Google Cloud Composer
EksDeleteClusterOperator
Deletes the Amazon EKS Cluster control plane and all nodegroups attached to it.
AzureBaseHook
This hook acts as a base hook for azure services. It offers several authentication mechanisms to authenticate the client library used for upstream azure hooks.
GithubOperator
GithubOperator to interact and perform action on GitHub API. This operator is designed to use GitHub Python SDK
EksDeleteFargateProfileOperator
Deletes an AWS Fargate profile from an Amazon EKS Cluster.
CloudVisionUpdateProductOperator
Makes changes to a Product resource. Only the display_name, description, and labels fields can be updated right now.
CloudVisionCreateReferenceImageOperator
Creates and returns a new ReferenceImage ID resource.
EKSCreateClusterOperator
This operator is deprecated. Please use airflow.providers.amazon.aws.operators.eks.EksCreateClusterOperator.
CloudVisionAddProductToProductSetOperator
Adds a Product to the specified ProductSet. If the Product is already present, no change is made.
EKSCreateFargateProfileOperator
This operator is deprecated. Please use airflow.providers.amazon.aws.operators.eks.EksCreateFargateProfileOperator.
EKSCreateNodegroupOperator
This operator is deprecated. Please use airflow.providers.amazon.aws.operators.eks.EksCreateNodegroupOperator.
EKSPodOperator
This operator is deprecated. Please use airflow.providers.amazon.aws.operators.eks.EksPodOperator.
CloudVisionUpdateProductSetOperator
Makes changes to a ProductSet resource. Only display_name can be updated currently.
CloudVisionRemoveProductFromProductSetOperator
Removes a Product from the specified ProductSet.
EksDeleteNodegroupOperator
Deletes an Amazon EKS managed node group from an Amazon EKS Cluster.
EKSDeleteClusterOperator
This operator is deprecated. Please use airflow.providers.amazon.aws.operators.eks.EksDeleteClusterOperator.
EKSDeleteFargateProfileOperator
This operator is deprecated. Please use airflow.providers.amazon.aws.operators.eks.EksDeleteFargateProfileOperator.
CloudVisionImageAnnotateOperator
Run image detection and annotation for an image or a batch of images.
EKSDeleteNodegroupOperator
This operator is deprecated. Please use airflow.providers.amazon.aws.operators.eks.EksDeleteNodegroupOperator.
EksCreateNodegroupOperator
Creates an Amazon EKS managed node group for an existing Amazon EKS Cluster.
CloudVisionDeleteProductSetOperator
Permanently deletes a ProductSet. Products and ReferenceImages in the ProductSet are not deleted. The actual image files are not deleted from Google Cloud Storage.
TrinoToMySqlOperator
Moves data from Trino to MySQL, note that for now the data is loaded into memory before being pushed to MySQL, so this operator should be used for smallish amount of data.
MsSqlToHiveOperator
Moves data from Microsoft SQL Server to Hive. The operator runs your query against Microsoft SQL Server, stores the file locally before loading it into a Hive table. If the create or recreate arguments are set to True, a CREATE TABLE and DROP TABLE statements are generated. Hive data types are inferred from the cursors metadata. Note that the table generated in Hive uses STORED AS textfile which isnt the most efficient serialization format. If a large amount of data is loaded and/or if the table gets queried considerably, you may want to use this operator only to stage the data into a temporary table before loading it into its final destination using a HiveOperator.
HiveToMySqlOperator
Moves data from Hive to MySQL, note that for now the data is loaded into memory before being pushed to MySQL, so this operator should be used for smallish amount of data.
PrestoToMySqlOperator
Moves data from Presto to MySQL, note that for now the data is loaded into memory before being pushed to MySQL, so this operator should be used for smallish amount of data.
ExternalTaskSensor
Waits for a different DAG or a task in a different DAG to complete for a specific execution_date
SpannerDeployDatabaseInstanceOperator
Creates a new Cloud Spanner database, or if database exists, the operator does nothing.
CloudSQLBaseOperator
Abstract base operator for Google Cloud SQL operators to inherit from.
HiveToDruidOperator
Moves data from Hive to Druid, [del]note that for now the data is loaded into memory before being pushed to Druid, so this operator should be used for smallish amount of data.[/del]
SnowflakeCheckOperator
Performs a check against Snowflake. The SnowflakeCheckOperator expects a sql query that will return a single row. Each value on that first row is evaluated using python bool casting. If any of the values return False the check is failed and errors out.
HiveToDynamoDBOperator
Moves data from Hive to DynamoDB, note that for now the data is loaded into memory before being pushed to DynamoDB, so this operator should be used for smallish amount of data.
SQLThresholdCheckOperator
Performs a value check using sql code against a minimum threshold and a maximum threshold. Thresholds can be in the form of a numeric value OR a sql statement that results a numeric.
SpannerUpdateDatabaseInstanceOperator
Updates a Cloud Spanner database with the specified DDL statement.
CloudSQLPatchInstanceDatabaseOperator
Updates a resource containing information about a database inside a Cloud SQL instance using patch semantics.
MySqlToHiveOperator
Moves data from MySql to Hive. The operator runs your query against MySQL, stores the file locally before loading it into a Hive table. If the create or recreate arguments are set to True, a CREATE TABLE and DROP TABLE statements are generated. Hive data types are inferred from the cursors metadata. Note that the table generated in Hive uses STORED AS textfile which isnt the most efficient serialization format. If a large amount of data is loaded and/or if the table gets queried considerably, you may want to use this operator only to stage the data into a temporary table before loading it into its final destination using a HiveOperator.
SQLCheckOperator
Performs checks against a db. The SQLCheckOperator expects a sql query that will return a single row. Each value on that first row is evaluated using python bool casting. If any of the values return False the check is failed and errors out.
SalesforceToGcsOperator
Submits Salesforce query and uploads results to Google Cloud Storage
SQLIntervalCheckOperator
Checks that the values of metrics given as SQL expressions are within a certain tolerance of the ones from days_back before.
CloudSQLImportInstanceOperator
Imports data into a Cloud SQL instance from a SQL dump or CSV file in Cloud Storage.
CloudSQLCreateInstanceDatabaseOperator
Creates a new database inside a Cloud SQL instance.
SpannerDeployInstanceOperator
Creates a new Cloud Spanner instance, or if an instance with the same instance_id exists in the specified project, updates the Cloud Spanner instance.
HiveToSambaOperator
Executes hql code in a specific Hive database and loads the results of the query as a csv to a Samba location.
SpannerQueryDatabaseInstanceOperator
Executes an arbitrary DML query (INSERT, UPDATE, DELETE).
CloudSQLExportInstanceOperator
Exports data from a Cloud SQL instance to a Cloud Storage bucket as a SQL dump or CSV file.
SnowflakeToSlackOperator
Executes an SQL statement in Snowflake and sends the results to Slack. The results of the query are rendered into the slack_message parameter as a Pandas dataframe using a JINJA variable called {{ results_df }}. The results_df variable name can be changed by specifying a different results_df_name parameter. The Tabulate library is added to the JINJA environment as a filter to allow the dataframe to be rendered nicely. as an ascii rendered table.
SpannerDeleteInstanceOperator
Deletes a Cloud Spanner instance. If an instance does not exist, no action is taken and the operator succeeds.
SnowflakeIntervalCheckOperator
Checks that the values of metrics given as SQL expressions are within a certain tolerance of the ones from days_back before.
CloudSQLExecuteQueryOperator
Performs DML or DDL query on an existing Cloud Sql instance. It optionally uses cloud-sql-proxy to establish secure connection with the database.
CloudSQLCreateInstanceOperator
Creates a new Cloud SQL instance. If an instance with the same name exists, no action will be taken and the operator will succeed.
SnowflakeValueCheckOperator
Performs a simple check using sql code against a specified value, within a certain level of tolerance.
VerticaToHiveOperator
Moves data from Vertica to Hive. The operator runs your query against Vertica, stores the file locally before loading it into a Hive table. If the create or recreate arguments are set to True, a CREATE TABLE and DROP TABLE statements are generated. Hive data types are inferred from the cursors metadata. Note that the table generated in Hive uses STORED AS textfile which isnt the most efficient serialization format. If a large amount of data is loaded and/or if the table gets queried considerably, you may want to use this operator only to stage the data into a temporary table before loading it into its final destination using a HiveOperator.
SSHHook
Hook for ssh remote execution using Paramiko. This hook also lets you create ssh tunnel and serve as basis for SFTP file transfer
EmrCreateJobFlowOperator
Creates an EMR JobFlow, reading the config from the EMR connection. A dictionary of JobFlow overrides can be passed that override the config from the connection.
SQSSensor
This sensor is deprecated. Please use airflow.providers.amazon.aws.sensors.sqs.SqsSensor.
SqsSensor
Get messages from an SQS queue and then deletes the message from the SQS queue. If deletion of messages fails an AirflowException is thrown otherwise, the message is pushed through XCom with the key messages.
FireboltHook
A client to interact with Firebolt. This hook requires the firebolt_conn_id connection. The firebolt login, password, and api_endpoint field must be setup in the connection. Other inputs can be defined in the connection or hook instantiation.
CloudantHook
Interact with Cloudant. This class is a thin wrapper around the cloudant python library.
PagerdutyHook
The PagerdutyHook can be used to interact with both the PagerDuty API and the PagerDuty Events API.
SalesforceHook
Creates new connection to Salesforce and allows you to pull data out of SFDC and save it to a file.
MySQLToS3Operator
This class is deprecated. Please use airflow.providers.amazon.aws.transfers.sql_to_s3.SqlToS3Operator.
S3ToGCSOperator
Synchronizes an S3 key, possibly a prefix, with a Google Cloud Storage destination path.
PubSubPullSensor
Pulls messages from a PubSub subscription and passes them through XCom. Always waits for at least one message to be returned from the subscription.
PubSubPullOperator
Pulls messages from a PubSub subscription and passes them through XCom. If the queue is empty, returns empty list - never waits for messages. If you do need to wait, please use airflow.providers.google.cloud.sensors.PubSubPullSensor instead.
QubolePartitionSensor
Wait for a Hive partition to show up in QHS (Qubole Hive Service) and check for its presence via QDS APIs
JenkinsJobTriggerOperator
Trigger a Jenkins Job and monitor its execution. This operator depend on python-jenkins library, version >= 0.4.15 to communicate with jenkins server. Youll also need to configure a Jenkins connection in the connections screen.
BranchDayOfWeekOperator
Branches into one of two lists of tasks depending on the current day.
CloudDataTransferServiceGCSToGCSOperator
Copies objects from a bucket to another using the Google Cloud Storage Transfer Service.
DataprepRunJobGroupOperator
Create a jobGroup, which launches the specified job as the authenticated user. This performs the same action as clicking on the Run Job button in the application. To get recipe_id please follow the Dataprep API documentation
AsanaCreateTaskOperator
This operator can be used to create Asana tasks. For more information on Asana optional task parameters, see
OpsgenieCreateAlertOperator
This operator allows you to post alerts to Opsgenie. Accepts a connection that has an Opsgenie API key as the connections password. This operator sets the domain to conn_id.host, and if not set will default to https://api.opsgenie.com.
QuboleCheckOperator
Performs a simple value check using Qubole command. By default, each value on the first row of this Qubole command is compared with a pre-defined value. The check fails and errors out if the output of the command is not within the permissible limit of expected value.
SlackWebhookOperator
This operator allows you to post messages to Slack using incoming webhooks. Takes both Slack webhook token directly and connection that has Slack webhook token. If both supplied, http_conn_id will be used as base_url, and webhook_token will be taken as endpoint, the relative path of the url.
GoogleDisplayVideo360GetSDFDownloadOperationSensor
Sensor for detecting the completion of SDF operation.
AwsBatchWaitersHook
This hook is deprecated. Please use airflow.providers.amazon.aws.hooks.batch.BatchWaitersHook.
SqoopHook
This hook is a wrapper around the sqoop 1 binary. To be able to use the hook it is required that “sqoop” is in the PATH.
CloudFunctionDeleteFunctionOperator
Deletes the specified function from Google Cloud Functions.
CloudDLPDeleteDLPJobOperator
Deletes a long-running DlpJob. This method indicates that the client is no longer interested in the DlpJob result. The job will be cancelled if possible.
WorkflowsCreateWorkflowOperator
Creates a new workflow. If a workflow with the specified name already exists in the specified project and location, the long running operation will return [ALREADY_EXISTS][google.rpc.Code.ALREADY_EXISTS] error.
CloudSpeechToTextRecognizeSpeechOperator
Recognizes speech from audio file and returns it as text.
FacebookAdsReportToGcsOperator
Fetches the results from the Facebook Ads API as desired in the params Converts and saves the data as a temporary JSON file Uploads the JSON to Google Cloud Storage
StackdriverDisableNotificationChannelsOperator
Disables one or more enabled notification channels identified by filter parameter. Inoperative in case the policy is already disabled.
WorkflowsListWorkflowsOperator
Lists Workflows in a given project and location. The default order is not specified.
OpsgenieCloseAlertOperator
This operator allows you to close alerts to Opsgenie. Accepts a connection that has an Opsgenie API key as the connections password. This operator sets the domain to conn_id.host, and if not set will default to api.opsgenie.com.
AsanaUpdateTaskOperator
This operator can be used to update Asana tasks. For more information on Asana optional task parameters, see developers.asana.com/docs/update-a-task
QuboleFileSensor
Wait for a file or folder to be present in cloud storage and check for its presence via QDS APIs
BigtableTableReplicationCompletedSensor
Sensor that waits for Cloud Bigtable table to be fully replicated to its clusters. No exception will be raised if the instance or the table does not exist.
CloudDataCatalogSearchCatalogOperator
Searches Data Catalog for multiple resources like entries, tags that match a query.
PapermillOperator
Executes a jupyter notebook through papermill that is annotated with parameters
GoogleCampaignManagerDownloadReportOperator
Retrieves a report and uploads it to GCS bucket.
WorkflowsCancelExecutionOperator
Cancels an execution using the given workflow_id and execution_id.
AzureCosmosDocumentSensor
Checks for the existence of a document which matches the given query in CosmosDB.
CloudDLPDeidentifyContentOperator
De-identifies potentially sensitive info from a ContentItem. This method has limits on input size and output size.
GCSObjectsWtihPrefixExistenceSensor
This class is deprecated. Please use airflow.providers.google.cloud.sensors.gcs.GCSObjectsWithPrefixExistenceSensor.
BigQueryDataTransferServiceTransferRunSensor
Waits for Data Transfer Service run to complete.
CloudDLPCreateJobTriggerOperator
Creates a job trigger to run DLP actions such as scanning storage for sensitive information on a set schedule.
QuboleCheckOperator
Performs checks against Qubole Commands. QuboleCheckOperator expects a command that will be executed on QDS. By default, each value on first row of the result of this Qubole Command is evaluated using python bool casting. If any of the values return False, the check is failed and errors out.
DataprepGetJobGroupOperator
Get the specified job group. A job group is a job that is executed from a specific node in a flow. API documentation clouddataprep.com/documentation/api#section/Overview
BaseBranchOperator
This is a base class for creating operators with branching functionality, similarly to BranchPythonOperator.
SQSPublishOperator
This operator is deprecated. Please use airflow.providers.amazon.aws.operators.sqs.SqsPublishOperator.
PrestoToGCSOperator
Copy data from PrestoDB to Google Cloud Storage in JSON or CSV format.
CloudDLPInspectContentOperator
Finds potentially sensitive info in content. This method has limits on input size, processing time, and output size.
EC2InstanceStateSensor
Check the state of the AWS EC2 instance until state of the instance become equal to the target state.
AWSDataSyncOperator
This operator is deprecated. Please use airflow.providers.amazon.aws.operators.datasync.DataSyncOperator.
CloudDLPCreateDLPJobOperator
Creates a new job to inspect storage or calculate risk metrics.
SegmentHook
Create new connection to Segment and allows you to pull data out of Segment or write to it.
GCSToGoogleSheetsOperator
Uploads .csv file from Google Cloud Storage to provided Google Spreadsheet.
AthenaSensor
Asks for the state of the Query until it reaches a failure state or success state. If the query fails, the task will fail.
GoogleAnalyticsRetrieveAdsLinksListOperator
Lists webProperty-Google Ads links for a given web property
StepFunctionGetExecutionOutputOperator
An Operator that begins execution of an Step Function State Machine
CloudDLPRedactImageOperator
Redacts potentially sensitive info from an image. This method has limits on input size, processing time, and output size.
SlackAPIOperator
Base Slack Operator The SlackAPIPostOperator is derived from this operator. In the future additional Slack API Operators will be derived from this class as well. Only one of slack_conn_id and token is required.
CloudFirestoreExportDatabaseOperator
Exports a copy of all or a subset of documents from Google Cloud Firestore to another storage system, such as Google Cloud Storage.
GoogleAnalyticsDeletePreviousDataUploadsOperator
Deletes previous GA uploads to leave the latest file to control the size of the Data Set Quota.
CloudDLPCreateStoredInfoTypeOperator
Creates a pre-built stored infoType to be used for inspection.
DayOfWeekSensor
Waits until the first specified day of the week. For example, if the execution day of the task is 2018-12-22 (Saturday) and you pass FRIDAY, the task will wait until next Friday.
GoogleDataprepHook
Hook for connection with Dataprep API. To get connection Dataprep with Airflow you need Dataprep token. clouddataprep.com/documentation/api#section/Authentication
ComputeEngineCopyInstanceTemplateOperator
Copies the instance template, applying specified changes.
AwsSnsHook
This hook is deprecated. Please use airflow.providers.amazon.aws.hooks.sns.SnsHook.
StepFunctionExecutionSensor
Asks for the state of the Step Function State Machine Execution until it reaches a failure state or success state. If it fails, failing the task.
CloudDataTransferServiceS3ToGCSOperator
Synchronizes an S3 bucket with a Google Cloud Storage bucket using the Google Cloud Storage Transfer Service.
CloudDatastoreImportEntitiesOperator
Import entities from Cloud Storage to Google Cloud Datastore
StackdriverListAlertPoliciesOperator
Fetches all the Alert Policies identified by the filter passed as filter parameter. The desired return type can be specified by the format parameter, the supported formats are “dict”, “json” and None which returns python dictionary, stringified JSON and protobuf respectively.
OracleToAzureDataLakeOperator
Moves data from Oracle to Azure Data Lake. The operator runs the query against Oracle and stores the file locally before loading it into Azure Data Lake.
DatadogHook
Uses datadog API to send metrics of practically anything measurable, so its possible to track
AwsGlueCatalogHook
This hook is deprecated. Please use airflow.providers.amazon.aws.hooks.glue_catalog.GlueCatalogHook.
SageMakerEndpointSensor
Asks for the state of the endpoint state until it reaches a terminal state. If it fails the sensor errors, the task fails.
JiraOperator
JiraOperator to interact and perform action on Jira issue tracking system. This operator is designed to use Jira Python SDK.
GoogleDisplayVideo360SDFtoGCSOperator
Download SDF media and save it in the Google Cloud Storage.
BigtableCreateInstanceOperator
Creates a new Cloud Bigtable instance. If the Cloud Bigtable instance with the given ID exists, the operator does not compare its configuration and immediately succeeds. No changes are made to the existing instance.
WasbTaskHandler
WasbTaskHandler is a python log handler that handles and reads task instance logs. It extends airflow FileTaskHandler and uploads to and reads from Wasb remote storage.
SqlSensor
Runs a sql statement repeatedly until a criteria is met. It will keep trying until success or failure criteria are met, or if the first cell is not in (0, 0, , None). Optional success and failure callables are called with the first cell returned as the argument. If success callable is defined the sensor will keep retrying until the criteria is met. If failure callable is defined and the criteria is met the sensor will raise AirflowException. Failure criteria is evaluated before success criteria. A fail_on_empty boolean can also be passed to the sensor in which case it will fail if no rows have been returned
GoogleDisplayVideo360DeleteReportOperator
Deletes a stored query as well as the associated stored reports.
OpsgenieAlertHook
This hook allows you to post alerts to Opsgenie. Accepts a connection that has an Opsgenie API key as the connections password. This hook sets the domain to conn_id.host, and if not set will default to api.opsgenie.com.
StackdriverEnableNotificationChannelsOperator
Enables one or more disabled alerting policies identified by filter parameter. Inoperative in case the policy is already enabled.
SageMakerTrainingSensor
Asks for the state of the training state until it reaches a terminal state. If it fails the sensor errors, failing the task.
CloudDataCatalogUpdateTagTemplateFieldOperator
Updates a field in a tag template. This method cannot be used to update the field type.
CloudDLPCreateInspectTemplateOperator
Creates an InspectTemplate for re-using frequently used configuration for inspecting content, images, and storage.
DatadogSensor
A sensor to listen, with a filter, to datadog event streams and determine if some event was emitted.
StackdriverDisableAlertPoliciesOperator
Disables one or more enabled alerting policies identified by filter parameter. Inoperative in case the policy is already disabled.
CloudDataCatalogDeleteTagTemplateOperator
Deletes a tag template and all tags using the template.
AzureFileShareToGCSOperator
Synchronizes a Azure FileShare directory content (excluding subdirectories), possibly filtered by a prefix, with a Google Cloud Storage destination path.
ComputeEngineInstanceGroupUpdateManagerTemplateOperator
Patches the Instance Group Manager, replacing source template URL with the destination one. API V1 does not have update/patch operations for Instance Group Manager, so you must use beta or newer API version. Beta is the default.
CloudFormationCreateStackSensor
Waits for a stack to be created successfully on AWS CloudFormation.
CloudDataTransferServiceJobStatusSensor
Waits for at least one operation belonging to the job to have the expected status.
LocalFilesystemToGCSOperator
Uploads a file or list of files to Google Cloud Storage. Optionally can compress the file for upload.
AzureBlobStorageToGCSOperator
Operator transfers data from Azure Blob Storage to specified bucket in Google Cloud Storage
AWSCloudFormationHook
This hook is deprecated. Please use airflow.providers.amazon.aws.hooks.cloud_formation.CloudFormationHook.
AwsFirehoseHook
This hook is deprecated. Please use airflow.providers.amazon.aws.hooks.kinesis.FirehoseHook.
GoogleCalendarHook
Interact with Google Calendar via Google Cloud connection Reading and writing cells in Google Sheet
BashSensor
Executes a bash command/script and returns True if and only if the return code is 0.
DataflowJobAutoScalingEventsSensor
Checks for the job autoscaling event in Google Cloud Dataflow.
HiveStatsCollectionOperator
Gathers partition statistics using a dynamically generated Presto query, inserts the stats into a MySql table with this format. Stats overwrite themselves if you rerun the same date/partition.
DingdingOperator
This operator allows you send Dingding message using Dingding custom bot. Get Dingding token from conn_id.password. And prefer set domain to conn_id.host, if not will use default oapi.dingtalk.com.
CloudDatastoreExportEntitiesOperator
Export entities from Google Cloud Datastore to Cloud Storage
SFTPOperator
SFTPOperator for transferring files from remote host to local or vice a versa. This operator uses ssh_hook to open sftp transport channel that serve as basis for file transfer.
CloudDataTransferServiceResumeOperationOperator
Resumes a transfer operation in Google Storage Transfer Service.
CloudFunctionInvokeFunctionOperator
Invokes a deployed Cloud Function. To be used for testing purposes as very limited traffic is allowed.
SparkJDBCHook
This hook extends the SparkSubmitHook specifically for performing data transfers to/from JDBC-based databases with Apache Spark.
AWSDataSyncHook
This hook is deprecated. Please use airflow.providers.amazon.aws.hooks.datasync.DataSyncHook.
CloudDLPListInfoTypesOperator
Returns a list of the sensitive information types that the DLP API supports.
PinotAdminHook
This hook is a wrapper around the pinot-admin.sh script. For now, only small subset of its subcommands are implemented, which are required to ingest offline data into Apache Pinot (i.e., AddSchema, AddTable, CreateSegment, and UploadSegment). Their command options are based on Pinot v0.1.0.
AutoMLDeployModelOperator
Deploys a model. If a model is already deployed, deploying it with the same parameters has no effect. Deploying with different parameters (as e.g. changing node_number) will reset the deployment state without pausing the model_ids availability.
StackdriverListNotificationChannelsOperator
Fetches all the Notification Channels identified by the filter passed as filter parameter. The desired return type can be specified by the format parameter, the supported formats are “dict”, “json” and None which returns python dictionary, stringified JSON and protobuf respectively.
AirbyteTriggerSyncOperator
This operator allows you to submit a job to an Airbyte server to run a integration process between your source and destination.
CloudTasksQueueDeleteOperator
Deletes a queue from Cloud Tasks, even if it has tasks in it.
BigQueryToMySqlOperator
Fetches the data from a BigQuery table (alternatively fetch data for selected columns) and insert that data into a MySQL table.
GoogleAnalyticsGetAdsLinkOperator
Returns a web property-Google Ads link to which the user has access.
S3ToHiveOperator
Moves data from S3 to Hive. The operator downloads a file from S3, stores the file locally before loading it into a Hive table. If the create or recreate arguments are set to True, a CREATE TABLE and DROP TABLE statements are generated. Hive data types are inferred from the cursors metadata from.
CloudDataCatalogDeleteTagTemplateFieldOperator
Deletes a field in a tag template and all uses of that field.
LocalToAzureDataLakeStorageOperator
This class is deprecated. Please use airflow.providers.microsoft.azure.transfers.local_to_adls.LocalFilesystemToADLSOperator.
AwsGlueJobHook
This hook is deprecated. Please use airflow.providers.amazon.aws.hooks.glue.GlueJobHook.
MetastorePartitionSensor
An alternative to the HivePartitionSensor that talk directly to the MySQL db. This was created as a result of observing sub optimal queries generated by the Metastore thrift service when hitting subpartitioned tables. The Thrift services queries were written in a way that wouldnt leverage the indexes.
DatastoreHook
Interact with Google Cloud Datastore. This hook uses the Google Cloud connection.
GlacierJobOperationSensor
Glacier sensor for checking job state. This operator runs only in reschedule mode.
OSSKeySensor
Waits for a key (a file-like instance on OSS) to be present in a OSS bucket. OSS being a key/value it does not support folders. The path is just a key a resource.
CloudNaturalLanguageAnalyzeEntitySentimentOperator
Finds entities, similar to AnalyzeEntities in the text and analyzes sentiment associated with each entity and its mentions.
SparkJDBCOperator
This operator extends the SparkSubmitOperator specifically for performing data transfers to/from JDBC-based databases with Apache Spark. As with the SparkSubmitOperator, it assumes that the “spark-submit” binary is available on the PATH.
CloudTextToSpeechSynthesizeOperator
Synthesizes text to speech and stores it in Google Cloud Storage
BigQueryToMsSqlOperator
Fetches the data from a BigQuery table (alternatively fetch data for selected columns) and insert that data into a MSSQL table.
AwsBatchClientHook
This hook is deprecated. Please use airflow.providers.amazon.aws.hooks.batch.BatchClientHook.
TimeDeltaSensor
Waits for a timedelta after the tasks execution_date + schedule_interval. In Airflow, the daily task stamped with execution_date 2016-01-01 can only start running on 2016-01-02. The timedelta here represents the time after the execution period has closed.
SageMakerTransformSensor
Asks for the state of the transform state until it reaches a terminal state. The sensor will error if the job errors, throwing a AirflowException containing the failure reason.
SystemsManagerParameterStoreBackend
Retrieves Connection or Variables from AWS SSM Parameter Store
TelegramOperator
This operator allows you to post messages to Telegram using Telegram Bot API. Takes both Telegram Bot API token directly or connection that has Telegram token in password field. If both supplied, token parameter will be given precedence.
AwsGlueCrawlerHook
This hook is deprecated. Please use airflow.providers.amazon.aws.hooks.glue_crawler.GlueCrawlerHook.
CloudDataTransferServiceDeleteJobOperator
Delete a transfer job. This is a soft delete. After a transfer job is deleted, the job and all the transfer executions are subject to garbage collection. Transfer jobs become eligible for garbage collection 30 days after soft delete.
CloudVideoIntelligenceDetectVideoExplicitContentOperator
Performs video annotation, annotating explicit content.
GoogleAnalyticsModifyFileHeadersDataImportOperator
GA has a very particular naming convention for Data Import.
SFTPToS3Operator
This operator enables the transferring of files from a SFTP server to Amazon S3.
CloudDatastoreBeginTransactionOperator
Begins a new transaction. Returns a transaction handle.
WorkflowsListExecutionsOperator
Returns a list of executions which belong to the workflow with the given name. The method returns executions of all workflow revisions. Returned executions are ordered by their start time (newest first).
CloudFunctionDeployFunctionOperator
Creates a function in Google Cloud Functions. If a function with this name already exists, it will be updated.
GoogleAnalyticsDataImportUploadOperator
Take a file from Cloud Storage and uploads it to GA via data import API.
GCSToGoogleDriveOperator
Copies objects from a Google Cloud Storage service to a Google Drive service, with renaming if requested.
WinRMOperator
WinRMOperator to execute commands on given remote host using the winrm_hook.
DiscordWebhookHook
This hook allows you to post messages to Discord using incoming webhooks. Takes a Discord connection ID with a default relative webhook endpoint. The default endpoint can be overridden using the webhook_endpoint parameter (discordapp.com/developers/docs/resources/webhook).
GCSObjectsWithPrefixExistenceSensor
Checks for the existence of GCS objects at a given prefix, passing matches via XCom.
BigtableDeleteInstanceOperator
Deletes the Cloud Bigtable instance, including its clusters and all related tables.
CloudDataTransferServiceCancelOperationOperator
Cancels a transfer operation in Google Storage Transfer Service.
ComputeEngineBaseOperator
Abstract base operator for Google Compute Engine operators to inherit from.
CloudDatastoreRunQueryOperator
Run a query for entities. Returns the batch of query results.
CloudFormationDeleteStackSensor
Waits for a stack to be deleted successfully on AWS CloudFormation.
SparkSubmitOperator
This hook is a wrapper around the spark-submit binary to kick off a spark-submit job. It requires that the “spark-submit” binary is in the PATH or the spark-home is set in the extra on the connection.
CloudDataFusionPipelineStateSensor
Check the status of the pipeline in the Google Cloud Data Fusion
WorkflowsUpdateWorkflowOperator
Updates an existing workflow. Running this method has no impact on already running executions of the workflow. A new revision of the workflow may be created as a result of a successful update operation. In that case, such revision will be used in new workflow executions.
DingdingHook
This hook allows you send Dingding message using Dingding custom bot. Get Dingding token from conn_id.password. And prefer set domain to conn_id.host, if not will use default oapi.dingtalk.com.
BranchDateTimeOperator
Branches into one of two lists of tasks depending on the current datetime. For more information on how to use this operator, take a look at the guide. BranchDateTimeOperator
MongoSensor
Checks for the existence of a document which matches the given query in MongoDB.
AsanaFindTaskOperator
This operator can be used to retrieve Asana tasks that match various filters. See developers.asana.com/docs/update-a-task for a list of possible filters.
StackdriverUpsertAlertOperator
Creates a new alert or updates an existing policy identified the name field in the alerts parameter.
StackdriverUpsertNotificationChannelOperator
Creates a new notification or updates an existing notification channel identified the name field in the alerts parameter.
WorkflowExecutionSensor
Checks state of an execution for the given workflow_id and execution_id.
DockerSwarmOperator
Execute a command as an ephemeral docker swarm service. Example use-case - Using Docker Swarm orchestration to make one-time scripts highly available.
GSheetsHook
Interact with Google Sheets via Google Cloud connection Reading and writing cells in Google Sheet. developers.google.com/sheets/api/guides/values
SparkSqlHook
This hook is a wrapper around the spark-sql binary. It requires that the “spark-sql” binary is in the PATH.
WorkflowsDeleteWorkflowOperator
Deletes a workflow with the specified name. This method also cancels and deletes all running executions of the workflow.
WorkflowsGetExecutionOperator
Returns an execution for the given workflow_id and execution_id.
TableauOperator
Execute a Tableau API Resource tableau.github.io/server-client-python/docs/api-ref
StepFunctionStartExecutionOperator
An Operator that begins execution of an Step Function State Machine
TableauRefreshWorkbookOperator
This operator is deprecated. Please use airflow.providers.tableau.operators.tableau.
CloudDataTransferServiceListOperationsOperator
Lists long-running operations in Google Storage Transfer Service that match the specified filter.
CloudVideoIntelligenceDetectVideoLabelsOperator
Performs video annotation, annotating video labels.
AWSAthenaOperator
This operator is deprecated. Please use airflow.providers.amazon.aws.operators.athena.AthenaOperator.
MSSQLToGCSOperator
Copy data from Microsoft SQL Server to Google Cloud Storage in JSON or CSV format.
CloudDLPCreateDeidentifyTemplateOperator
Creates a DeidentifyTemplate for re-using frequently used configuration for de-identifying content, images, and storage.
CloudDataTransferServicePauseOperationOperator
Pauses a transfer operation in Google Storage Transfer Service.
DataprepGetJobsForJobGroupOperator
Get information about the batch jobs within a Cloud Dataprep job. API documentation clouddataprep.com/documentation/api#section/Overview
AzureCosmosInsertDocumentOperator
Inserts a new document into the specified Cosmos database and collection It will create both the database and collection if they do not already exist
WorkflowsCreateExecutionOperator
Creates a new execution using the latest revision of the given workflow.
AwsGlueCatalogPartitionSensor
This sensor is deprecated. Please use airflow.providers.amazon.aws.sensors.glue_catalog_partition.GlueCatalogPartitionSensor.
CloudTasksQueuePurgeOperator
Purges a queue by deleting all of its tasks from Cloud Tasks.
GCSUploadSessionCompleteSensor
Checks for changes in the number of objects at prefix in Google Cloud Storage bucket and returns True if the inactivity period has passed with no increase in the number of objects. Note, this sensor will no behave correctly in reschedule mode, as the state of the listed objects in the GCS bucket will be lost between rescheduled invocations.
AwsDynamoDBHook
This class is deprecated. Please use airflow.providers.amazon.aws.hooks.dynamodb.DynamoDBHook.
TelegramHook
This hook allows you to post messages to Telegram using the telegram python-telegram-bot library.
CloudVideoIntelligenceDetectVideoShotsOperator
Performs video annotation, annotating video shots.
SageMakerTuningSensor
Asks for the state of the tuning state until it reaches a terminal state. The sensor will error if the job errors, throwing a AirflowException containing the failure reason.
StackdriverEnableAlertPoliciesOperator
Enables one or more disabled alerting policies identified by filter parameter. Inoperative in case the policy is already enabled.
KylinCubeOperator
This operator is used to submit request about kylin build/refresh/merge, and can track job status . so users can easier to build kylin job
CloudDatastoreCommitOperator
Commit a transaction, optionally creating, deleting or modifying some entities.
SubDagOperator
This runs a sub dag. By convention, a sub dags dag_id should be prefixed by its parent and a dot. As in parent.child. Although SubDagOperator can occupy a pool/concurrency slot, user can specify the mode=reschedule so that the slot will be released periodically to avoid potential deadlock.
CloudNaturalLanguageAnalyzeEntitiesOperator
Finds named entities in the text along with entity types, salience, mentions for each entity, and other properties.
SlackWebhookHook
This hook allows you to post messages to Slack using incoming webhooks. Takes both Slack webhook token directly and connection that has Slack webhook token. If both supplied, http_conn_id will be used as base_url, and webhook_token will be taken as endpoint, the relative path of the url.
GoogleAdsToGcsOperator
Fetches the daily results from the Google Ads API for 1-n clients Converts and saves the data as a temporary CSV file Uploads the CSV to Google Cloud Storage
CloudwatchTaskHandler
CloudwatchTaskHandler is a python log handler that handles and reads task instance logs.
CloudDLPUpdateStoredInfoTypeOperator
Updates the stored infoType by creating a new version.
CeleryQueueSensor
Waits for a Celery queue to be empty. By default, in order to be considered empty, the queue must not have any tasks in the reserved, scheduled or active states.
DiscordWebhookOperator
This operator allows you to post messages to Discord using incoming webhooks. Takes a Discord connection ID with a default relative webhook endpoint. The default endpoint can be overridden using the webhook_endpoint parameter (discordapp.com/developers/docs/resources/webhook).
FTPToS3Operator
This operator enables the transfer of files from a FTP server to S3. It can be used to transfer one or multiple files.
MongoHook
Interact with Mongo. This hook uses the Mongo conn_id. PyMongo Wrapper to Interact With Mongo Database Mongo Connection Documentation docs.mongodb.com/manual/reference/connection-string/index.html You can specify connection string options in extra field of your connection docs.mongodb.com/manual/reference/connection-string/index.html#connection-string-options
GoogleApiToS3Operator
Basic class for transferring data from a Google API endpoint into a S3 Bucket.
CloudDataTransferServiceGetOperationOperator
Gets the latest state of a long-running operation in Google Storage Transfer Service.
AzureKeyVaultBackend
Retrieves Airflow Connections or Variables from Azure Key Vault secrets.
FivetranHook
Fivetran API interaction hook. :param fivetran_conn_id Conn ID of the Connection to be used to
DummyOperator
Operator that does literally nothing. It can be used to group tasks in a DAG.
OpsgenieAlertOperator
This operator is deprecated. Please use airflow.providers.opsgenie.operators.opsgenie.OpsgenieCreateAlertOperator.
LatestOnlyOperator
Allows a workflow to skip tasks that are not running during the most recent schedule interval.
OpsgenieAlertHook
This hook allows you to post alerts to Opsgenie. Accepts a connection that has an Opsgenie API key as the connections password. This hook sets the domain to conn_id.host, and if not set will default to api.opsgenie.com.
SageMakerTransformSensor
Asks for the state of the transform state until it reaches a terminal state. The sensor will error if the job errors, throwing a AirflowException containing the failure reason.
SageMakerTuningSensor
Asks for the state of the tuning state until it reaches a terminal state. The sensor will error if the job errors, throwing a AirflowException containing the failure reason.
SageMakerBaseSensor
Contains general sensor behavior for SageMaker. Subclasses should implement get_sagemaker_response() and state_from_response() methods. Subclasses should also implement NON_TERMINAL_STATES and FAILED_STATE methods.
SageMakerTrainingSensor
Asks for the state of the training state until it reaches a terminal state. If it fails the sensor errors, failing the task.
SageMakerEndpointSensor
Asks for the state of the endpoint state until it reaches a terminal state. If it fails the sensor errors, the task fails.
EMRContainerHook
This class is deprecated. Please use airflow.providers.amazon.aws.hooks.emr.EmrContainerHook.
EMRContainerOperator
This class is deprecated. Please use airflow.providers.amazon.aws.operators.emr.EmrContainerOperator.
EmrStepSensor
Asks for the state of the step until it reaches any of the target states. If it fails the sensor errors, failing the task.
EmrJobFlowSensor
Asks for the state of the EMR JobFlow (Cluster) until it reaches any of the target states. If it fails the sensor errors, failing the task.
EMRContainerSensor
This class is deprecated. Please use airflow.providers.amazon.aws.sensors.emr.EmrContainerSensor.
dataframe
Convert a Table object into a Pandas DataFrame or persist a DataFrame result to a database table.
merge
Merge data into an existing table in situations where there may be conflicts. This function adds data to a table with either an "update" or "ignore" strategy. The "ignore" strategy does not add values that conflict, while the "update" strategy overwrites the older values.
save_file
Export a database table as a CSV or Parquet file to local storage, Amazon S3, or Google Cloud Storage.
append
Append the results of a source table onto a target table. This function assumes there are no conflicts between the schemas of both tables.
TempTable
A metadata representation of database table within the Astro ecosystem that will not be persisted after corresponding operations are complete.
Table
A metadata representation of an existing or to-be-created database table within the Astro ecosystem.
transform_file
Execute a SELECT SQL statement contained in a file. Data returned from this SQL is inserted into a temporary table which can used by other downstream tasks.
transform
Execute an explicit, SELECT SQL statement. Data returned from this SQL is inserted into a temporary table which can used by other downstream tasks.
load_file
Load CSV or Parquet files from local storage, Amazon S3, or Google Cloud Storage into a SQL database.
aggregate_check
Validate the result from a SQL which performs an aggregation matches an expected value or falls within a provided range.
S3DeleteObjectsOperator
To enable users to delete single object or multiple objects from a bucket using a single HTTP request.
S3KeySizeSensor
Waits for a key (a file-like instance on S3) to be present and be more than some size in a S3 bucket. S3 being a key/value it does not support folders. The path is just a key a resource.
S3PrefixSensor
Waits for a prefix or all prefixes to exist. A prefix is the first part of a key, thus enabling checking of constructs similar to glob airfl* or SQL LIKE airfl%. There is the possibility to precise a delimiter to indicate the hierarchy or keys, meaning that the match will stop at that delimiter. Current code accepts sane delimiters, i.e. characters that are NOT special characters in the Python regex engine.
S3FileTransformOperator
Copies data from a source S3 location to a temporary location on the local filesystem. Runs a transformation on this file as specified by the transformation script and uploads the output to a destination S3 location.
S3KeysUnchangedSensor
Checks for changes in the number of objects at prefix in AWS S3 bucket and returns True if the inactivity period has passed with no increase in the number of objects. Note, this sensor will not behave correctly in reschedule mode, as the state of the listed objects in the S3 bucket will be lost between rescheduled invocations.
S3KeySensor
Waits for a key (a file-like instance on S3) to be present in a S3 bucket. S3 being a key/value it does not support folders. The path is just a key a resource.
S3ListPrefixesOperator
List all subfolders from the bucket with the given string prefix in name.
StepFunctionStartExecutionOperator
An Operator that begins execution of an Step Function State Machine
StepFunctionExecutionSensor
Asks for the state of the Step Function State Machine Execution until it reaches a failure state or success state. If it fails, failing the task.
StepFunctionGetExecutionOutputOperator
An Operator that begins execution of an Step Function State Machine
EC2InstanceStateSensor
Check the state of the AWS EC2 instance until state of the instance become equal to the target state.
PlexusHook
Used for jwt token generation and storage to make Plexus API calls. Requires email and password Airflow variables be created.
AzureDataLakeStorageDeleteOperator
This class is deprecated. Please use airflow.providers.microsoft.azure.operators.adls.ADLSDeleteOperator.
AzureDataLakeStorageListOperator
This class is deprecated. Please use airflow.providers.microsoft.azure.operators.adls.ADLSListOperator.
HightouchTriggerSyncOperator
This operator triggers a run for a specified Sync in Hightouch via the Hightouch API.
DatahubRestHook
Creates a DataHub Rest API connection used to send metadata to DataHub. Takes the endpoint for your DataHub Rest API in the Server Endpoint(host) field.
DatahubEmitterOperator
Emits a Metadata Change Event to DataHub using either a DataHub Rest or Kafka connection.
DatahubGenericHook
Emits Metadata Change Events using either the DatahubRestHook or the DatahubKafkaHook. Set up a DataHub Rest or Kafka connection to use.
DatahubBaseOperator
The DatahubBaseOperator is used as a base operator all DataHub operators.
DatahubKafkaHook
Creates a DataHub Kafka connection used to send metadata to DataHub. Takes your kafka broker in the Kafka Broker(host) field.