C++: Support access paths for flow sources #22113

Draft
MathiasVP wants to merge 5 commits into github:main from MathiasVP:cpp-access-paths-for-sources-and-sinks

MathiasVP commented Jul 2, 2026

This PR adds the ability to specify sources with access paths in MaD (Models as Data). I've shamelessly stolen some of the code from #18298, but I've hit some problems that I can't seem to find a good solution to. The code in 717a846 is (to the best of my knowledge) a clean translation from Rust to C++. However, the `.expected` files show some issues that I tried to solve in 15f09e8, and I really don't like that "solution". I'm not sure whether this ever comes up in Rust.

@hvitved would you mind taking a look at this draft and helping me address these two (probably related) problems I'm facing:

  1. The first problem happens when I change sourceNode from:

    predicate sourceNode(DataFlow::Node node, string kind, string model) {
      exists(SourceSinkInterpretationInput::InterpretNode n |
        isSourceNode(n, kind, model) and n.asNode() = node
      )
    }

    to:

    predicate sourceNode(DataFlow::Node node, string kind, string model) {
      node.(Nodes::FlowSummaryNode).isSource(kind, model)
    }

    This changes every MaD source that specifies an output argument as the source: instead of the post-update node of the argument (i.e., `ReadFile output argument`), the source becomes `call to ReadFile`, which is then followed by a `FlowSummaryImpl::Private::Steps::sourceLocalStep` step to the output argument. However, this makes flow paths look really odd, since the actual source is the output argument and not the return value of the function. You can see this happening in c2f7bf1.
    To work around this in 15f09e8, I hacked the one extension of `SourceElement` so that it only holds for models that have a non-trivial output column. But, as I said, I really don't like this approach.

  2. The second problem (which may be related to the first one) shows up as the missing result in c2f7bf1. It happens because I chose `SourceBase` to be `Call`s (or rather: the only extension I provided was for calls). If a database contains no calls to the function, but we still want to specify that a parameter of that uncalled function is a source, then there is no `Call` to serve as the flow source entry, so this MaD flow source is no longer picked up as a flow source.
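
To make the two cases concrete, here is a minimal C++ sketch of the kind of analyzed code they involve. The names `read_file_stub`, `caller` and `on_request` are hypothetical stand-ins for illustration; they are not taken from the PR or its tests, and the real `ReadFile` has a different signature.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-in for an API such as ReadFile: the data arrives through
// the output argument `buffer`; the return value only reports success.
bool read_file_stub(char* buffer, std::size_t size) {
    const char payload[] = "untrusted";
    if (size < sizeof(payload)) return false;
    std::memcpy(buffer, payload, sizeof(payload));
    return true;
}

// Problem 1: the data of interest lives in `buffer` after the call (the
// post-update node of the argument), not in the call expression itself,
// whose value is just a status flag.
std::string caller() {
    char buffer[32] = {};
    bool ok = read_file_stub(buffer, sizeof(buffer));
    return ok ? std::string(buffer) : std::string();
}

// Problem 2: a hypothetical callback that nothing in this program calls.
// A model that marks `request` as a source has no call site to attach to.
std::string on_request(const char* request) {
    return std::string(request);
}
```

In `caller`, a path that starts at the call would suggest that the `bool` result carries the data, which is the oddity described in the first problem. `on_request` is the shape of code where a call-based source entry finds nothing, as in the second problem.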

@hvitved I'd love to know if these problems also exist in Rust and whether I've missed some subtle thing about how Rust does things to make the above work.

github-actions bot added the C++ label Jul 2, 2026
source.getFunction() = callable.asSourceCallable()
)
or
exists(Position pos |
MathiasVP requested a review from hvitved July 2, 2026 15:34