Table of Contents

What is Deep Classiflie?

Project Motivation

Model Exploration

The best way to start understanding/exploring the current model is to use the explorers on

Prediction Explorer

Explore randomly sampled predictions from the test set of the latest model incarnation. The explorer uses Captum’s implementation of integrated gradients [7] to visualize attributions of statement predictions to tokens in each statement. Read more about the explorer below.
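
To give a feel for what integrated gradients computes, here is a minimal pure-Python sketch on a toy model. It is not Captum's implementation and not the Deep Classiflie model: the logistic `model`, its weights, and the all-zero baseline are assumptions chosen only for illustration. Each input's attribution is its distance from the baseline multiplied by the average gradient along the straight path from baseline to input.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def model(x, w, b):
    """Toy stand-in for a classifier: sigmoid of a linear score."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def model_grad(x, w, b):
    """Analytic gradient of the toy model with respect to each input."""
    s = model(x, w, b)
    return [s * (1.0 - s) * wi for wi in w]


def integrated_gradients(x, baseline, w, b, steps=200):
    """Midpoint Riemann-sum approximation of integrated gradients."""
    avg_grad = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps
        # Point on the straight line from the baseline to the input.
        point = [bi + alpha * (xi - bi) for xi, bi in zip(x, baseline)]
        for i, g in enumerate(model_grad(point, w, b)):
            avg_grad[i] += g / steps
    return [(xi - bi) * gi for xi, bi, gi in zip(x, baseline, avg_grad)]


# Hypothetical weights and inputs (three "token features").
w, b = [2.0, -1.0, 0.5], 0.1
x, baseline = [1.0, 1.0, 1.0], [0.0, 0.0, 0.0]

attributions = integrated_gradients(x, baseline, w, b)
# Completeness property: attributions sum (approximately) to f(x) - f(baseline).
delta = model(x, w, b) - model(baseline, w, b)
print(attributions, sum(attributions), delta)
```

In the explorer the same idea is applied per token, so the sign and size of each attribution indicate how strongly a token pushed the prediction in either direction.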

Performance Explorer

Explore the performance of the current model incarnation using confusion matrices oriented along temporal and confidence-based axes.

Current Predictions Explorer

Explore current predictions of the latest model. The most recent (max 5000) statements that have yet to be labeled by the currently used fact-checking sources (only Washington Post Factchecker at present) are available.

Live predictions are continuously added via ipfs. Twitter is polled for new statements every 3 minutes, every 15 minutes.

This explorer provides fact-checkers a means (one of many possible) of using current model predictions and may also help those building fact-checking systems evaluate the potential utility of integrating similar models into their systems.

Core Components

The entire initial Deep Classiflie system (raw dataset, model, analytics modules, twitter bot etc.) can be built from scratch using the publicly available code here [2].

| Component | Description |
|---|---|
| deep_classiflie | Core framework for building, training and analyzing fact-check facilitating ML models. Can operate independently from deep_classiflie_db when training a model using existing dataset collections or when performing inference. Depends on deep_classiflie_db for certain functions such as creating new dataset collections, running the tweetbot, running the analytics modules etc. [3] |
| deep_classiflie_db | Backend datastore for managing Deep Classiflie metadata, analyzing Deep Classiflie intermediate datasets and orchestrating Deep Classiflie model training pipelines. Includes data scraping modules for the initial model data sources (twitter,, washington post – politifact and the toronto star were removed from an earlier version and may be re-added among others as models for other prominent politicians are explored) |
Dataset Generation
Model Training
Analysis & Reporting

Current Performance


Global metrics [9] summarized in the table below relate to the current model’s performance on a test set comprising ~13K statements made between 2020-04-03 and 2020-07-08:
Global Stat Summary


To minimize false positives and maximize the model’s utility, the following approach is used to issue high-confidence predictions:

  1. All test set predictions are bucketed by model confidence (derived from the raw prediction sigmoid output).
  2. Various performance metrics are calculated, grouped by confidence bucket (4%/10% of test set for non-tweets/tweets respectively). Most relevantly:
    • PPV
    • Positive prediction ratio: (bucket true positives + bucket false positives)/#statements in bucket
    • Bucket-level accuracy
  3. Report estimated local accuracy metrics of a given prediction by associating it with its corresponding confidence bucket. See caveats regarding recognized performance biases [a].
    • In the prediction explorer, randomly sample 100 statements (including all confusion matrix classes) from each of four confidence buckets: the maximum and minimum accuracy buckets for each statement type.
      Max PPV Non-Tweets
      Max PPV Tweets
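
The bucketing approach above can be sketched roughly as follows. This is an illustrative approximation, not the project's analytics code: the `confidence` definition (probability of the predicted class), the 0.5 decision threshold, the equal-size buckets, and the sample records are all assumptions.

```python
def confidence(p):
    """Assumed confidence measure: probability of the predicted class."""
    return max(p, 1.0 - p)


def bucket_metrics(records, n_buckets):
    """Compute per-bucket metrics from (sigmoid_output, true_label) pairs.

    Label 1 is treated as the positive class. Buckets are equal-size
    slices of the records ranked by ascending confidence.
    """
    if not 0 < n_buckets <= len(records):
        raise ValueError("n_buckets must be between 1 and len(records)")
    ranked = sorted(records, key=lambda r: confidence(r[0]))
    stats = []
    for b in range(n_buckets):
        lo = b * len(ranked) // n_buckets
        hi = (b + 1) * len(ranked) // n_buckets
        bucket = ranked[lo:hi]
        tp = sum(1 for p, y in bucket if p >= 0.5 and y == 1)
        fp = sum(1 for p, y in bucket if p >= 0.5 and y == 0)
        correct = sum(1 for p, y in bucket if (p >= 0.5) == (y == 1))
        stats.append({
            "n": len(bucket),
            # PPV is undefined when the bucket has no positive predictions.
            "ppv": tp / (tp + fp) if tp + fp else None,
            "pos_pred_ratio": (tp + fp) / len(bucket),
            "accuracy": correct / len(bucket),
        })
    return stats


# Hypothetical (sigmoid output, true label) pairs.
records = [
    (0.55, 0), (0.45, 1), (0.60, 1), (0.35, 0),
    (0.90, 1), (0.05, 0), (0.97, 1), (0.92, 0),
]
stats = bucket_metrics(records, n_buckets=2)
for i, s in enumerate(stats):
    print(i, s)
```

A new prediction would then be reported alongside the metrics of whichever bucket its confidence falls into, which is what makes the accuracy estimate "local".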

Noteworthy Features

Dataset generation
Model training
Analysis & reporting

Data Pipeline

To conserve resources and for POC research expediency, the current pipeline uses a local relational DB (MariaDB). Ultimately, a distributed data store would be preferable and warranted if this project merits sufficient interest from the community or a POC involving a distributed network of models is initiated.

Deep Classiflie Data Pipeline

False Statement Filter Processes

Distribution Convergence Process

Dataset Generation Process


The parameters used in all Deep Classiflie job executions related to the development of the POC model are provided in the configs directory.

| Config File | Description |
|---|---|
| config_defaults.yaml | default values and descriptions of all non-sql parameters |
| config_defaults_sql.yaml | default values and descriptions of all sql parameters |
| dataprep_only.yaml | parameters used to generate dataset |
| train_albertbase.yaml | parameters used to recursively train the POC model |
| gen_swa_ckpt.yaml | parameters used to generate an swa checkpoint (current release was built using swa torchcontrib module but will switch to the now-integrated pytorch swa api in the next release) |
| gen_report.yaml | parameters used to generate model analysis report(s) |
| gen_dashboards.yaml | parameters used to generate model analysis dashboards |
| cust_predict.yaml | parameters used to perform model inference on arbitrary input statements |
| tweetbot.yaml | parameters used to run the tweetbot behind @DeepClassiflie |
| infsvc.yaml | parameters used to run the inference service behind the current prediction explorer |
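
Since the job-specific files sit alongside files of default values, it may help to picture a job config as a set of overrides layered over the defaults. How Deep Classiflie actually combines them is not described here, so the sketch below is only one plausible scheme: `deep_merge` is a hypothetical helper, and the nesting under `experiment` and `trainer` (and the `epochs` key) is invented for illustration. Only the names `model_cache_dir` and `inference_ckpt` come from this document.

```python
import copy


def deep_merge(defaults, overrides):
    """Return a new dict with overrides recursively layered over defaults."""
    merged = copy.deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are mappings: merge them key by key.
            merged[key] = deep_merge(merged[key], value)
        else:
            # Scalars, lists, and type mismatches: the override wins.
            merged[key] = copy.deepcopy(value)
    return merged


# Hypothetical stand-ins for parsed YAML files.
defaults = {
    "experiment": {
        "model_cache_dir": "~/datasets/model_cache/deep_classiflie/",
        "inference_ckpt": None,
    },
    "trainer": {"epochs": 1},
}
overrides = {
    "experiment": {
        "inference_ckpt": "/experiments/deep_classiflie/checkpoints/20201010172113/",
    },
}

merged = deep_merge(defaults, overrides)
print(merged)
```

Under this scheme a job file only needs to list the values that differ from the defaults, and the defaults themselves are never modified.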

Further Research

Model Replication


N.B. Before you begin: the core external dependency is admin access to a MariaDB or MySQL DB.

  1. Clone deep_classiflie and deep_classiflie_db (make them peer directories if you want to minimize configuration)
     git clone
     git clone
  2. Install conda if necessary. Then create and activate the deep_classiflie virtual env:
     conda env create -f ./deep_classiflie/assets/deep_classiflie.yml
     conda activate deep_classiflie
  3. Clone the captum and HuggingFace transformers repos. Install the transformers binaries:
     git clone
     git clone
     cd transformers
     pip install .
  4. Install mariadb or mysql DB if necessary.

  5. These are the relevant DB configuration settings used for the current release of Deep Classiflie’s backend. Divergence from this configuration has not been tested and may result in unexpected behavior.

     collation-server = utf8mb4_unicode_ci
     init-connect='SET NAMES utf8mb4'
     character-set-server = utf8mb4
     transaction-isolation = READ-COMMITTED
  6. Copy the relevant Deep Classiflie config file to your $HOME dir and update it:
     cp ./deep_classiflie_db/db_setup/.dc_config.example ~
     mv ~/.dc_config.example ~/.dc_config
     vi ~/.dc_config
     # configure values appropriate to your environment and move to $HOME
     # Sorry I haven't had a chance to write a setup config script for this yet...
     export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/local/cuda/lib64:/usr/local/cuda/extras/CUPTI/lib64"
     export CUDA_HOME=/usr/local/cuda
     export PYTHONPATH="${PYTHONPATH}:${HOME}/repos/edification/deep_classiflie:${HOME}/repos/captum:${HOME}/repos/transformers:${HOME}/repos/edification/deep_classiflie_db"
     export DC_BASE="$HOME/repos/edification/deep_classiflie"
     export DCDB_BASE="$HOME/repos/edification/deep_classiflie_db"
     export DCDB_PASS="dcbotpasshere"
     export DCDB_USER="dcbot"
     export DCDB_HOST="hostgoeshere"
     export DCDB_NAME="deep_classiflie"
  7. execute Deep Classiflie DB backend initialization script:
     cd deep_classiflie_db/db_setup
     ./ deep_classiflie

    Ensure you have access to a DB user with administrator privileges (“admin” in the case above).

  8. Log in to the backend DB and seed historical tweets (necessary because only the most recent 3200 can currently be retrieved directly from Twitter):
     mysql -u dcbot -p
     use deep_classiflie
     source dcbot_tweets_init_20200910.sql
  9. Copy the relevant base model weights to the specified model_cache_dir:
     # model_cache_dir default found in configs/config_defaults.yaml
     # it defaults to $HOME/datasets/model_cache/deep_classiflie/
     cd {PATH_TO_DEEP_CLASSIFLIE_BASE}/deep_classiflie/assets/
     cp albert-base-v2-pytorch_model.bin albert-base-v2-spiece.model {MODEL_CACHE_DIR}/
  10. Run with the provided config necessary to download the raw data from the relevant data sources (, twitter, washington post), execute the data processing pipeline and generate the dataset collection.
    cd deep_classiflie
    ./ --config "{PATH_TO_DEEP_CLASSIFLIE_BASE}/configs/dataprep_only.yaml"
    See relevant process diagrams to better understand the dataset generation pipeline and process.
    • While I have set seeds for the majority of randomized processes in the data pipeline, there are a couple points in the pipeline that remain non-deterministic at the moment (see issue #). As such, the dataset generation log messages should approximate those below, but variation within 1% is expected.
    (...lots of initial data download/parsing message above...) 
    2020-08-14 16:55:22,165:deep_classiflie:INFO: Proceeding with uninitialized base model to generate dist-based duplicate filter
    2020-08-14 16:55:22,501:deep_classiflie:INFO: Predictions from model weights: 
    2020-08-14 16:57:14,215:deep_classiflie:INFO: Generated 385220 candidates for false truth analysis
    2020-08-14 16:57:15,143:deep_classiflie:INFO: Deleted 7073 'truths' from truths table based on similarity with falsehoods enumerated in base_false_truth_del_cands
    2020-08-14 16:57:30,181:deep_classiflie:INFO: saved 50873 rows of a transformed truth distribution to db
    2020-08-14 16:57:30,192:deep_classiflie:DEBUG: DB connection obtained: <mysql.connector.pooling.PooledMySQLConnection object at 0x7f8216056e50>
    2020-08-14 16:57:30,220:deep_classiflie:DEBUG: DB connection closed: <mysql.connector.pooling.PooledMySQLConnection object at 0x7f8216056e50>
    2020-08-14 16:57:30,221:deep_classiflie:INFO: Building a balanced dataset from the following raw class data:
    2020-08-14 16:57:30,221:deep_classiflie:INFO: Label True: 50873 records
    2020-08-14 16:57:30,221:deep_classiflie:INFO: Label False: 19261 records
    2020-08-14 16:57:49,281:deep_classiflie:INFO: Saving features into cached file /home/speediedan/datasets/temp/deep_classiflie/train_converged_filtered.pkl
    2020-08-14 16:58:06,552:deep_classiflie:INFO: Saving features into cached file /home/speediedan/datasets/temp/deep_classiflie/val_converged_filtered.pkl
    2020-08-14 16:58:11,714:deep_classiflie:INFO: Saving features into cached file /home/speediedan/datasets/temp/deep_classiflie/test_converged_filtered.pkl
    2020-08-14 16:58:14,331:deep_classiflie:DEBUG: Metadata update complete, 1 record(s) affected.
  11. Recursively train the Deep Classiflie POC model:
    cd deep_classiflie
    ./ --config "{PATH_TO_DEEP_CLASSIFLIE_BASE}/configs/train_albertbase.yaml"
  12. Generate an SWA checkpoint (the current release was built using the torchcontrib SWA module but will switch to the now-integrated PyTorch SWA API in the next release):
    cd deep_classiflie
    ./ --config "{PATH_TO_DEEP_CLASSIFLIE_BASE}/configs/gen_swa_ckpt.yaml"
  13. Generate model analysis report(s) using the generated swa checkpoint:
    # NOTE, swa checkpoint generated in previous step must be added to gen_report.yaml
    cd deep_classiflie
    ./ --config "{PATH_TO_DEEP_CLASSIFLIE_BASE}/configs/gen_report.yaml"
  14. Generate model analysis dashboards:
    # NOTE, swa checkpoint generated in previous step must be added to gen_dashboards.yaml
    cd deep_classiflie
    ./ --config "{PATH_TO_DEEP_CLASSIFLIE_BASE}/configs/gen_dashboards.yaml"
  15. Configure the Jekyll static site generator to use the Bokeh dashboards locally:

    sudo apt-get install ruby-full build-essential zlib1g-dev
    #add ruby gems to user profile
    echo '# Install Ruby Gems to ~/gems' >> ~/.bashrc
    echo 'export GEM_HOME="$HOME/gems"' >> ~/.bashrc
    echo 'export PATH="$HOME/gems/bin:$PATH"' >> ~/.bashrc
    source ~/.bashrc
    #install jekyll (ensure you're in the build dir (docs))
    gem install jekyll bundler
    #to get nokogiri to install, you may need to be root
    sudo gem install nokogiri
    #vi ./deep_classiflie/docs/Gemfile
    source ''
    gem 'nokogiri'
    gem 'rack', '~> 2.1.4'
    gem 'rspec'
    gem 'jekyll-theme-cayman'
    gem "github-pages", "~> 207", group: :jekyll_plugins
    gem "activesupport", ">="
    gem 'jekyll-sitemap'
    gem "kramdown", ">=2.3.0"
    #note if just updating components, best approach is to update all 
    bundle update
    #start local server from ./deep_classiflie/docs/
    cd ./deep_classiflie/docs/
    bundle exec jekyll serve
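
Several of the steps above depend on the environment variables exported in step 6, so a quick pre-flight check can save a failed run. The sketch below is a hypothetical helper, not part of the Deep Classiflie codebase. The variable names are the ones listed in step 6; treating all of them as required is an assumption.

```python
import os

# Variable names taken from the exports in step 6.
REQUIRED_VARS = (
    "DC_BASE",
    "DCDB_BASE",
    "DCDB_USER",
    "DCDB_PASS",
    "DCDB_HOST",
    "DCDB_NAME",
)


def missing_env_vars(env=None, required=REQUIRED_VARS):
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("All Deep Classiflie environment variables are set.")
```

Passing a plain dict as `env` makes the check easy to try without touching the real environment, e.g. `missing_env_vars({"DC_BASE": "/tmp/dc"})`.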

Model Replication and Exploration with Docker


As of writing (2020.10.11), Docker Compose does not fully support GPU provisioning, so the docker CLI with the --gpus flag is used here.

  1. Pull image from docker hub
     sudo docker pull speediedan/deep_classiflie:v0.1.3
  2. Recursively train model using latest dataset.
    • create a local directory to bind mount and use for exploring experiment output and start training container
      mkdir /tmp/docker_experiment_output
      sudo docker container run --rm -d --gpus all --mount type=bind,source=/tmp/docker_experiment_output,target=/experiments --name deep_classiflie_train speediedan/deep_classiflie:v0.1.3  \
      conda run -n deep_classiflie python --config /home/deep_classiflie/repos/deep_classiflie/configs/docker_train_albertbase.yaml 
    • run tensorboard container to follow training progress (~6 hrs on a single GPU)
      sudo docker container run --rm -d --gpus all --mount type=bind,source=/tmp/docker_experiment_output,target=/experiments -p 6006:6006 --workdir /experiments/deep_classiflie/logs --name deep_classiflie_tb speediedan/deep_classiflie:v0.1.3 conda run -n deep_classiflie tensorboard --host --logdir=/experiments/deep_classiflie/logs --reload_multifile=true
  3. Use a trained checkpoint to evaluate test performance
    • start the container with a local bind mount
      sudo docker container run --rm -it --gpus all --mount type=bind,source=/tmp/docker_experiment_output,target=/experiments --name deep_classiflie_explore speediedan/deep_classiflie:v0.1.3
    • update the docker_test_only.yaml file, passing the desired inference path (e.g. /experiments/deep_classiflie/checkpoints/20201010172113/)
      vi configs/docker_test_only.yaml
      inference_ckpt: "/experiments/deep_classiflie/checkpoints/20201010172113/"
    • evaluate on test set
      conda run -n deep_classiflie python --config /home/deep_classiflie/repos/deep_classiflie/configs/docker_test_only.yaml
  4. Run custom predictions
    • update model checkpoint used for predictions with the one you trained
        vi /home/deep_classiflie/repos/deep_classiflie/configs/docker_cust_predict.yaml
        inference_ckpt: "/experiments/deep_classiflie/checkpoints/20201010172113/"
    • add tweets or statements to do inference/interpretation on as desired by modifying /home/deep_classiflie/datasets/explore_pred_interpretations.json
    • generate predictions
      conda run -n deep_classiflie python --config /home/deep_classiflie/repos/deep_classiflie/configs/docker_cust_predict.yaml --pred_inputs /home/deep_classiflie/datasets/explore_pred_interpretations.json
    • review the prediction interpretation card in a browser on the local host:
      chrome /tmp/docker_experiment_output/deep_classiflie/logs/20201011203013/inference_output/example_stmt_1_0.png
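
The checkpoint and log paths in the steps above contain run directories whose names look like timestamps (e.g. 20201010172113), so after several runs it can be handy to locate the newest one. The helper below is a hypothetical convenience, not part of the image. It assumes run directories are named with 14 digits in YYYYMMDDHHMMSS order, which is inferred only from the example paths.

```python
import re
from pathlib import Path

# Assumed naming scheme for run directories: 14 digits, YYYYMMDDHHMMSS.
TIMESTAMP_DIR = re.compile(r"^\d{14}$")


def latest_run_dir(root):
    """Return the newest timestamp-named subdirectory of root, or None."""
    runs = [
        p for p in Path(root).iterdir()
        if p.is_dir() and TIMESTAMP_DIR.match(p.name)
    ]
    # Fixed-width digit names sort chronologically as plain strings.
    return max(runs, key=lambda p: p.name, default=None)


if __name__ == "__main__":
    # Host-side view of the bind mount used in the steps above.
    root = Path("/tmp/docker_experiment_output/deep_classiflie/checkpoints")
    if root.is_dir():
        print(latest_run_dir(root))
```

The returned directory could then be used when filling in `inference_ckpt` in the docker_test_only.yaml or docker_cust_predict.yaml files.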


Citing Deep Classiflie

Please cite:

    author       = {Dan Dale},
    title        = {{Deep Classiflie: Shallow fact-checking with deep neural networks}},
    month        = sep,
    year         = 2020,
    doi          = {10.5281/zenodo.4046591},
    version      = {v0.1.3-alpha},
    publisher    = {Zenodo},
    url          = {}

Feel free to star the repo as well if you find it useful or interesting. Thanks!

References and Notes


