Thousands of Publicly Exposed MLflow Instances — A Hidden Risk in MLOps Infrastructure

Summary

MLflow is widely used by machine learning and data science teams to track experiments, store artifacts, manage model versions, and organize the machine learning lifecycle.

During internet-wide reconnaissance with FOFA, Shodan, and other asset search tools, I observed 25,000+ publicly accessible MLflow instances.

Many of these instances appeared to be exposed without authentication, which means anyone on the internet may be able to access MLflow dashboards and APIs.

This can lead to exposure of machine learning experiments, model artifacts, datasets, internal project names, and model registry information.

This is best described not as a traditional software vulnerability in MLflow, but as a deployment security and secure-by-default posture issue.

What is MLflow?

MLflow is an open-source platform used for managing the machine learning lifecycle.

It is commonly used for:

Tracking ML experiments
Storing metrics and parameters
Managing model artifacts
Registering models
Comparing training runs
Managing model versions
Supporting MLOps workflows

MLflow is used by data scientists, ML engineers, AI teams, research teams, and enterprise organizations to manage machine learning projects.

A typical MLflow setup contains:

MLflow Tracking Server
MLflow Web UI
Backend Store
Artifact Store
Model Registry

Because of this, MLflow often contains sensitive information such as models, datasets, experiment results, and internal ML workflows.

Description

During the research, many MLflow services were found to be publicly exposed on the internet without authentication.

MLflow provides authentication and security controls, but these protections need to be configured correctly by the operator.

By default, MLflow binds to 127.0.0.1, which means it is local-only. However, if someone starts the server with:

mlflow server --host 0.0.0.0

or exposes it through a public reverse proxy, cloud load balancer, Kubernetes ingress, or open firewall rule, the MLflow service may become accessible from the internet.

If authentication is not enabled, anyone can potentially access the MLflow UI and API endpoints.
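The difference between the default bind address and 0.0.0.0 can be checked mechanically. The following is a minimal sketch using Python's standard ipaddress module. The helper name is_loopback_bind is hypothetical and is not part of MLflow. Whether a non-loopback address is actually reachable from the internet still depends on firewalls, proxies, and network layout.

```python
import ipaddress


def is_loopback_bind(host: str) -> bool:
    """Return True if the bind address only accepts connections from the local machine."""
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal (for example a DNS name): cannot be shown to be local-only.
        return False


for host in ("127.0.0.1", "0.0.0.0", "::1", "10.0.0.5"):
    label = "local-only" if is_loopback_bind(host) else "listens beyond localhost"
    print(f"{host}: {label}")
```

A check like this could run in a deployment script before the server starts, refusing a non-loopback bind unless authentication is configured.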

Reconnaissance

The exposed MLflow instances were identified using public internet asset search engines.

FOFA Query

title="MLflow"

FOFA Link:

https://en.fofa.info/result?qbase64=dGl0bGU9Ik1MZmxvdyIg

Shodan Query

http.title:"MLflow"

Shodan Link:

https://www.shodan.io/search?query=http.title%3A%22MLflow%22

Total observed instances:

25,000+

Note: This count should be treated as a point-in-time estimate based on public search results.
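The FOFA and Shodan links above are the queries in encoded form: the FOFA link carries the query as base64 in the qbase64 parameter, and the Shodan link carries it percent-encoded in the query parameter. A minimal sketch of that encoding follows. The function names are hypothetical, and the URL layouts are taken only from the two links shown above. Decoding the FOFA link's parameter yields the query followed by a single trailing space, so the example includes that space to reproduce the link exactly.

```python
import base64
from urllib.parse import quote


def fofa_url(query: str) -> str:
    """Build a FOFA result URL by base64-encoding the query string."""
    encoded = base64.b64encode(query.encode("utf-8")).decode("ascii")
    return "https://en.fofa.info/result?qbase64=" + quote(encoded, safe="")


def shodan_url(query: str) -> str:
    """Build a Shodan search URL by percent-encoding the query string."""
    return "https://www.shodan.io/search?query=" + quote(query, safe="")


# Trailing space kept so the output matches the FOFA link in this article.
print(fofa_url('title="MLflow" '))
print(shodan_url('http.title:"MLflow"'))
```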

Proof of Concept

FOFA Result

FOFA results showing publicly accessible MLflow instances.

Shodan Result

Shodan results showing MLflow instances exposed on the internet.

MLflow Instance Example

Example of an exposed MLflow dashboard. Sensitive details should be redacted.

What Can an Attacker Do?

If an MLflow instance is exposed without authentication, an attacker may be able to:

View experiments
View run metadata
Access parameters and metrics
Identify internal project names
List artifacts
Access model files
View registered models
Create or delete runs
Modify experiment data
Register or overwrite models
Abuse artifact storage
Disrupt ML workflows

The exact impact depends on the MLflow configuration and backend storage permissions.

Example Read-Only Validation

A read-only request can be used to check whether the service is MLflow:

curl -i https://mlflow.example.com/version

Experiment search example:

curl -sS https://mlflow.example.com/api/2.0/mlflow/experiments/search \
  -H 'Content-Type: application/json' \
  -d '{"max_results":5}'

Run search example:

curl -sS https://mlflow.example.com/api/2.0/mlflow/runs/search \
  -H 'Content-Type: application/json' \
  -d '{"experiment_ids":["1"],"max_results":5}'

These examples should only be used on systems you own or are authorized to test.
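The experiment search request can also be expressed in Python with only the standard library. This is a sketch, not an official MLflow client: the function names are hypothetical, and the mapping from HTTP status to a conclusion (200 means open, 401 or 403 means access control is present) is an assumption that a proxy or gateway in front of the server could invalidate.

```python
import json
import urllib.error
import urllib.request


def build_experiment_search(base_url: str, max_results: int = 5) -> urllib.request.Request:
    """Build the same POST request as the curl experiment search example."""
    url = base_url.rstrip("/") + "/api/2.0/mlflow/experiments/search"
    body = json.dumps({"max_results": max_results}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def classify_status(status: int) -> str:
    """Interpret the HTTP status code (assumed mapping, see the text above)."""
    if status == 200:
        return "unauthenticated access appears possible"
    if status in (401, 403):
        return "access control is enforced"
    return "inconclusive"


def check(base_url: str, timeout: float = 10.0) -> str:
    """Send the request. Call this only against systems you own or are authorized to test."""
    try:
        with urllib.request.urlopen(build_experiment_search(base_url), timeout=timeout) as resp:
            return classify_status(resp.status)
    except urllib.error.HTTPError as err:
        return classify_status(err.code)
```

Nothing is sent until check() is called, for example check("https://mlflow.example.com") against an instance you operate.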

Impact

An exposed MLflow service can affect confidentiality, integrity, and availability.

Confidentiality Impact

An attacker may access:

ML experiments
Model artifacts
Training metadata
Dataset references
Internal project names
Model registry information
Cloud storage paths

This can lead to leakage of proprietary machine learning research and intellectual property.

Integrity Impact

An attacker may be able to:

Modify experiments
Create fake runs
Delete runs
Register unauthorized models
Tamper with model registry data

This can impact trust in ML workflows and model governance.

Availability Impact

An attacker may disrupt ML operations by:

Deleting runs
Removing artifacts
Breaking registry state
Polluting experiment history
Affecting downstream deployment workflows

CVSS Calculation

For a fully exposed MLflow instance with no authentication and accessible APIs, the severity can be considered Critical.

CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H

Score:

9.8 Critical

This may vary depending on the environment, artifact permissions, and whether additional access controls are present.
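The 9.8 score can be reproduced from the vector with the CVSS v3.1 base score equations. The sketch below covers only the scope-unchanged (S:U) case used here, with metric weights and the Roundup function taken from the CVSS v3.1 specification. The function and variable names are illustrative.

```python
import math

# Metric weights from the CVSS v3.1 specification (PR values are for unchanged scope).
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "PR": {"N": 0.85, "L": 0.62, "H": 0.27},
    "UI": {"N": 0.85, "R": 0.62},
    "CIA": {"H": 0.56, "L": 0.22, "N": 0.0},
}


def roundup(value: float) -> float:
    """Round up to one decimal place, as defined in CVSS v3.1 Appendix A."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector: str) -> float:
    """Compute the base score for a CVSS v3.1 vector with unchanged scope."""
    parts = dict(item.split(":") for item in vector.split("/")[1:])
    if parts["S"] != "U":
        raise ValueError("this sketch only handles S:U")
    c, i, a = (WEIGHTS["CIA"][parts[k]] for k in ("C", "I", "A"))
    iss = 1 - (1 - c) * (1 - i) * (1 - a)
    impact = 6.42 * iss
    exploitability = 8.22 * math.prod(WEIGHTS[k][parts[k]] for k in ("AV", "AC", "PR", "UI"))
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


print(base_score("CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H"))  # 9.8
```

For this vector the impact sub-score is about 5.87 and the exploitability sub-score is about 3.89, which sum to about 9.76 and round up to 9.8.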

Important Clarification

This is not a single version-specific vulnerability.

MLflow already provides security features such as:

Localhost binding by default
Built-in authentication
Security middleware
Host validation controls
Reverse proxy deployment support

The real issue is that many deployments are exposed publicly without these protections enabled.

A better way to describe the issue is:

Publicly exposed unauthenticated MLflow deployments

rather than:

MLflow authentication bypass

Secure-by-Default Concern

Although MLflow provides authentication and security features, many users may still accidentally expose MLflow while testing or deploying quickly.

Some improvements that could reduce accidental exposure include:

Strong warning when using --host 0.0.0.0 without authentication
Security warning in startup logs
Explicit confirmation before public unauthenticated binding
More visible security guidance in quickstart documentation
Authentication enabled by default for non-local deployments
Explicit opt-out for unauthenticated public access

These changes could help new users avoid accidentally exposing sensitive ML infrastructure.

Mitigation

Organizations running MLflow should take the following actions.

1. Do Not Expose MLflow Directly to the Internet

Use:

VPN
Private network
Internal load balancer
IP allowlist
Zero Trust access
Private Kubernetes ingress

2. Enable Authentication

Use MLflow built-in authentication:

pip install 'mlflow[auth]'

export MLFLOW_FLASK_SERVER_SECRET_KEY='replace-with-strong-secret'

mlflow server \
  --host 127.0.0.1 \
  --port 5000 \
  --app-name basic-auth
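The placeholder value in the export line above should be replaced with a long random string. One way to produce such a value is Python's standard secrets module. This is a small sketch, and the helper name generate_flask_secret and the 32-byte length are illustrative choices rather than an MLflow requirement.

```python
import secrets


def generate_flask_secret(num_bytes: int = 32) -> str:
    """Return a URL-safe random string built from num_bytes of cryptographic randomness."""
    return secrets.token_urlsafe(num_bytes)


# Prints a ready-to-paste export line; store the value in a secret manager, not in source control.
print(f"export MLFLOW_FLASK_SERVER_SECRET_KEY='{generate_flask_secret()}'")
```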

3. Use Reverse Proxy Authentication

Place MLflow behind NGINX, OAuth2 Proxy, SSO, or an identity-aware gateway.

Example:

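server {
    listen 443 ssl;
    server_name mlflow.example.com;

    # Placeholder certificate paths; "listen ... ssl" requires a certificate and key.
    ssl_certificate     /etc/nginx/tls/mlflow.example.com.crt;
    ssl_certificate_key /etc/nginx/tls/mlflow.example.com.key;

    # Require HTTP basic authentication for every request.
    auth_basic "Restricted MLflow";
    auth_basic_user_file /etc/nginx/htpasswd;

    location / {
        # Forward to MLflow, which listens on localhost only.
        proxy_pass http://127.0.0.1:5000;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-Proto $scheme;
    }
}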

4. Restrict Artifact Storage

Ensure artifact stores such as S3, GCS, Azure Blob, or local storage are not publicly accessible.

Use least-privilege permissions.

5. Monitor Logs

Monitor access to sensitive MLflow paths such as:

/version
/api/2.0/mlflow/experiments/search
/api/2.0/mlflow/runs/search
/api/2.0/mlflow/runs/get
/api/2.0/mlflow/runs/create
/api/2.0/mlflow/runs/delete
/api/2.0/mlflow-artifacts/
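Monitoring for the paths listed above can start with a simple access log scan. The sketch below assumes a reverse proxy writing the common "combined" access log format, where the request line appears in double quotes. The sample log lines, the client address, and the function name are hypothetical and exist only for illustration.

```python
import re

# Paths from the list above: exact matches, plus one prefix for artifact access.
SENSITIVE_EXACT = {
    "/version",
    "/api/2.0/mlflow/experiments/search",
    "/api/2.0/mlflow/runs/search",
    "/api/2.0/mlflow/runs/get",
    "/api/2.0/mlflow/runs/create",
    "/api/2.0/mlflow/runs/delete",
}
SENSITIVE_PREFIXES = ("/api/2.0/mlflow-artifacts/",)

# Matches the quoted request line, e.g. "POST /path HTTP/1.1".
REQUEST_RE = re.compile(r'"(?P<method>[A-Z]+) (?P<path>\S+) HTTP/[\d.]+"')


def sensitive_hits(lines):
    """Return (method, path) pairs for requests that touch sensitive MLflow paths."""
    hits = []
    for line in lines:
        match = REQUEST_RE.search(line)
        if not match:
            continue
        path = match.group("path").split("?", 1)[0]  # ignore the query string
        if path in SENSITIVE_EXACT or path.startswith(SENSITIVE_PREFIXES):
            hits.append((match.group("method"), path))
    return hits


# Hypothetical sample lines, not taken from a real server.
SAMPLE_LINES = [
    '203.0.113.7 - - [01/Jan/2025:00:00:00 +0000] "POST /api/2.0/mlflow/runs/delete HTTP/1.1" 200 2',
    '203.0.113.7 - - [01/Jan/2025:00:00:01 +0000] "GET /static-files/main.js HTTP/1.1" 200 512',
    '203.0.113.7 - - [01/Jan/2025:00:00:02 +0000] "GET /api/2.0/mlflow-artifacts/artifacts/model.pkl?x=1 HTTP/1.1" 200 99',
]

for method, path in sensitive_hits(SAMPLE_LINES):
    print(method, path)
```

In practice the hits would be joined with client address and authentication status so that unauthenticated or unexpected callers stand out.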

Conclusion

MLflow is an important platform in the modern MLOps ecosystem.

It helps teams manage experiments, models, artifacts, and machine learning workflows.

However, when MLflow is exposed to the internet without authentication, it can become a serious security risk.

The discovery of 25,000+ publicly accessible MLflow instances shows that AI infrastructure is now part of the attack surface.

This issue is not only about MLflow. It is a reminder that machine learning infrastructure must be protected with the same seriousness as production applications.

Security controls exist, but they must be enabled.

In the age of AI, protecting the systems used to build models is just as important as protecting the models themselves.

References

MLflow Documentation: https://mlflow.org/docs/latest/
MLflow Authentication: https://mlflow.org/docs/latest/self-hosting/security/basic-http-auth/
MLflow Tracking Server: https://mlflow.org/docs/latest/self-hosting/architecture/tracking-server/
MLflow REST API: https://mlflow.org/docs/latest/api_reference/rest-api.html
Shodan Filters: https://www.shodan.io/search/filters
FOFA: https://en.fofa.info/
CVSS v3.1: https://www.first.org/cvss/v3-1/specification-document
