gcc-reflection/gcc/diagnostics/digraphs.h
David Malcolm e20eee3897 diagnostics: add optional CFG dumps to SARIF/HTML output sinks
This patch adds a new key/value pair "cfgs={yes,no}" to diagnostics
sinks, "no" by default.

If set to "yes" for a SARIF sink, then GCC will add the internal state
of the CFG for all functions after each pertinent optimization pass in
graph form to theRun.graphs in the SARIF output.

If set to "yes" for an HTML sink, the generated HTML will contain SVG
displaying the graphs, adapted from code in graph.cc.

Text sinks ignore it.

The SARIF output is thus a machine-readable serialization of (some of)
GCC's intermediate representation (as JSON), but it's much less than
GCC-XML used to provide.  The precise form of the information is
documented as subject to change without notice.

Currently it shows both gimple statements and RTL instructions,
depending on the pass.  My hope is that it should be possible to write a
"cfg-grep" tool that can read the SARIF and automatically identify
in which pass a particular piece of our IR appeared or disappeared,
for tracking down bugs in our optimization passes.
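
As a rough illustration of the "cfg-grep" idea, here is a minimal sketch
of the search it would perform.  It does not parse SARIF: pass_snapshot,
transition, find_transitions and make_example_snapshots are hypothetical
names, and the statements in the example data are made up.  It assumes the
per-pass CFG dumps have already been loaded into memory in pass order.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for one CFG dump in the SARIF output:
// the pass that produced it and the statements it contained.
struct pass_snapshot
{
  std::string pass_name;
  std::vector<std::string> stmts;
};

// A pass after which the presence of the searched-for text changed.
struct transition
{
  std::string pass_name;
  bool appeared; // true: text appeared; false: text disappeared.
};

// Return true if any statement in SNAP contains NEEDLE as a substring.
static bool
snapshot_contains_p (const pass_snapshot &snap, const std::string &needle)
{
  for (const auto &stmt : snap.stmts)
    if (stmt.find (needle) != std::string::npos)
      return true;
  return false;
}

// Walk SNAPS in pass order, recording each pass after which NEEDLE
// went from absent to present or vice versa.
static std::vector<transition>
find_transitions (const std::vector<pass_snapshot> &snaps,
                  const std::string &needle)
{
  std::vector<transition> result;
  bool present = false;
  for (const auto &snap : snaps)
    {
      const bool now = snapshot_contains_p (snap, needle);
      if (now != present)
        result.push_back ({snap.pass_name, now});
      present = now;
    }
  return result;
}

// Made-up example data; not real compiler output.
static std::vector<pass_snapshot>
make_example_snapshots ()
{
  return {
    {"cfg", {"x_1 = a_2 + 0;", "return x_1;"}},
    {"ccp1", {"return a_2;"}},
    {"optimized", {"return a_2;"}},
  };
}
```

Searching the example data for "x_1" reports that it appeared after the
first snapshot and disappeared after the second, which is the kind of
answer that would point at the pass to investigate.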

Implementation-wise:
* this uses the publish-subscribe mechanism from the earlier patch, by
  having the diagnostics sink subscribe to pass_events::after_pass
  messages from the pass_events_channel.
* the patch adds a new hook to cfghooks.h for dumping a basic block
  into a SARIF property bag.
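
The two points above can be sketched together in a simplified form.  This
is not GCC's pub-sub or cfghooks code: channel, after_pass_msg, ir_hooks,
dump_gimple_bb, dump_rtl_bb and run_demo are all hypothetical names, and
the "dump" is just a string.  It only aims to show a subscriber reacting
to an after-pass message and reaching the IR through a table of callbacks,
so that the same subscriber works for both gimple and RTL.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical message published after each pass has run.
struct after_pass_msg
{
  std::string pass_name;
  std::string function_name;
};

// A minimal publish-subscribe channel: subscribers are invoked
// in subscription order for every published message.
template <typename Msg>
class channel
{
public:
  using subscriber = std::function<void (const Msg &)>;

  void subscribe (subscriber s) { m_subscribers.push_back (std::move (s)); }

  void publish (const Msg &msg) const
  {
    for (const auto &s : m_subscribers)
      s (msg);
  }

private:
  std::vector<subscriber> m_subscribers;
};

// Hypothetical table of per-IR callbacks, standing in for the idea of
// a hook that knows how to dump a basic block for the current IR.
struct ir_hooks
{
  std::string (*dump_bb) (int bb_index);
};

static std::string
dump_gimple_bb (int idx)
{
  return "gimple bb " + std::to_string (idx);
}

static std::string
dump_rtl_bb (int idx)
{
  return "rtl bb " + std::to_string (idx);
}

// Publish two pass events, switching the active hooks in between, and
// return what the subscriber recorded.
static std::vector<std::string>
run_demo ()
{
  channel<after_pass_msg> pass_events;
  ir_hooks gimple_hooks = {dump_gimple_bb};
  ir_hooks rtl_hooks = {dump_rtl_bb};
  const ir_hooks *current_hooks = &gimple_hooks;
  std::vector<std::string> log;

  // The "sink": it never names gimple or RTL, it only calls the hook.
  pass_events.subscribe ([&] (const after_pass_msg &msg)
    {
      log.push_back (msg.pass_name + ": " + current_hooks->dump_bb (2));
    });

  pass_events.publish ({"ccp1", "foo"});
  current_hooks = &rtl_hooks;
  pass_events.publish ({"expand", "foo"});
  return log;
}
```

The design point is that the publisher does not know who is listening, and
the listener does not know which IR is active; both are decoupled through
the channel and the callback table respectively.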

gcc/ChangeLog:
	* Makefile.in (OBJS): Add tree-diagnostic-cfg.o.
	(OBJS-libcommon): Add custom-sarif-properties/cfg.o,
	diagnostics/digraphs-to-dot.o, and
	diagnostics/digraphs-to-dot-from-cfg.o.
	* cfghooks.cc: Define INCLUDE_VECTOR.  Add includes of
	"diagnostics/sarif-sink.h" and "custom-sarif-properties/cfg.h".
	(dump_bb_as_sarif_properties): New.
	* cfghooks.h (diagnostics::sarif_builder): New forward decl.
	(json::object): New forward decl.
	(cfg_hooks::dump_bb_as_sarif_properties): New callback field.
	(dump_bb_as_sarif_properties): New decl.
	* cfgrtl.cc (rtl_cfg_hooks): Populate the new callback
	field with rtl_dump_bb_as_sarif_properties.
	(cfg_layout_rtl_cfg_hooks): Likewise.
	* custom-sarif-properties/cfg.cc: New file.
	* custom-sarif-properties/cfg.h: New file.
	* diagnostics/digraphs-to-dot-from-cfg.cc: New file, partly
	adapted from gcc/graph.cc.
	* diagnostics/digraphs-to-dot.cc: New file.
	* diagnostics/digraphs-to-dot.h: New file, based on material in...
	* diagnostics/digraphs.cc: Include
	"diagnostics/digraphs-to-dot.h".
	(class conversion_to_dot): Rework and move to above.
	(make_dot_graph_from_diagnostic_graph): Likewise.
	(make_dot_node_from_digraph_node): Likewise.
	(make_dot_edge_from_digraph_edge): Likewise.
	(conversion_to_dot::get_dot_id_for_node): Likewise.
	(conversion_to_dot::has_edges_p): Likewise.
	(digraph::make_dot_graph): Use to_dot::converter::make and invoke
	the result to make the dot graph.
	* diagnostics/digraphs.h (digraph::get_all_nodes): New accessor.
	* diagnostics/html-sink.cc
	(html_builder::m_per_logical_loc_graphs): New field.
	(html_builder::add_graph_for_logical_loc): New.
	(html_sink::report_digraph_for_logical_location): New.
	* diagnostics/sarif-sink.cc (sarif_array_of_unique::get_element):
	New.
	(sarif_builder::report_digraph_for_logical_location): New.
	(sarif_sink::report_digraph_for_logical_location): New.
	* diagnostics/sink.h: Include "diagnostics/logical-locations.h".
	(sink::report_digraph_for_logical_location): New vfunc.
	* diagnostics/text-sink.h
	(text_sink::report_digraph_for_logical_location): New.
	* doc/invoke.texi (fdiagnostics-add-output): Clarify wording.
	Distinguish between scheme-specific vs GCC-specific keys, and add
	"cfgs" as the first example of the latter.
	* gimple-pretty-print.cc: Include "cfghooks.h", "json.h", and
	"custom-sarif-properties/cfg.h".
	(gimple_dump_bb_as_sarif_properties): New.
	* gimple-pretty-print.h (diagnostics::sarif_builder): New forward
	decl.
	(json::object): Likewise.
	(gimple_dump_bb_as_sarif_properties): New.
	* graphviz.cc (get_compass_pt_from_string): New.
	* graphviz.h (get_compass_pt_from_string): New decl.
	* libsarifreplay.cc (sarif_replayer::handle_graph_object): Fix
	overlong line.
	* opts-common.cc: Define INCLUDE_VECTOR.
	* opts-diagnostic.cc: Define INCLUDE_LIST.  Include
	"diagnostics/sarif-sink.h", "tree-diagnostic-sink-extensions.h",
	"opts-diagnostic.h", and "pub-sub.h".
	(class gcc_extra_keys): New class.
	(opt_spec_context::opt_spec_context): Add "client_keys" param and
	pass to dc_spec_context.
	(handle_gcc_specific_keys): New.
	(try_to_make_sink): New.
	(gcc_extension_factory::singleton): New.
	(handle_OPT_fdiagnostics_add_output_): Rework to use
	try_to_make_sink.
	(handle_OPT_fdiagnostics_set_output_): Likewise.
	* opts-diagnostic.h: Include "diagnostics/sink.h".
	(class gcc_extension_factory): New.
	* opts.cc: Define INCLUDE_LIST.
	* print-rtl.cc: Include "dumpfile.h", "cfghooks.h", "json.h", and
	"custom-sarif-properties/cfg.h".
	(rtl_dump_bb_as_sarif_properties): New.
	* print-rtl.h (diagnostics::sarif_builder): New forward decl.
	(json::object): Likewise.
	(rtl_dump_bb_as_sarif_properties): New decl.
	* tree-cfg.cc (gimple_cfg_hooks): Use
	gimple_dump_bb_as_sarif_properties for new callback field.
	* tree-diagnostic-cfg.cc: New file, based on material in graph.cc.
	* tree-diagnostic-sink-extensions.h: New file.
	* tree-diagnostic.cc: Define INCLUDE_LIST.  Include
	"tree-diagnostic-sink-extensions.h".
	(compiler_ext_factory): New.
	(tree_diagnostics_defaults): Set gcc_extension_factory::singleton
	to be compiler_ext_factory.

gcc/testsuite/ChangeLog:
	* gcc.dg/diagnostic-cfgs-html.py: New test.
	* gcc.dg/diagnostic-cfgs-sarif.py: New test.
	* gcc.dg/diagnostic-cfgs.c: New test.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2026-01-09 15:54:15 -05:00

/* Directed graphs associated with a diagnostic.
   Copyright (C) 2025-2026 Free Software Foundation, Inc.
   Contributed by David Malcolm <dmalcolm@redhat.com>

This file is part of GCC.

GCC is free software; you can redistribute it and/or modify it under
the terms of the GNU General Public License as published by the Free
Software Foundation; either version 3, or (at your option) any later
version.

GCC is distributed in the hope that it will be useful, but WITHOUT ANY
WARRANTY; without even the implied warranty of MERCHANTABILITY or
FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
for more details.

You should have received a copy of the GNU General Public License
along with GCC; see the file COPYING3. If not see
<http://www.gnu.org/licenses/>. */

#ifndef GCC_DIAGNOSTICS_DIGRAPHS_H
#define GCC_DIAGNOSTICS_DIGRAPHS_H

#include "json.h"
#include "tristate.h"
#include "diagnostics/logical-locations.h"

class graphviz_out;

class sarif_graph;
class sarif_node;
class sarif_edge;

namespace dot { class graph; }

namespace diagnostics {
namespace digraphs {

/* A family of classes: digraph, node, and edge, closely related to
   SARIF's graph, node, and edge types (SARIF v2.1.0 sections 3.39-3.41).

   Nodes can have child nodes, allowing for arbitrarily deep nesting.
   Edges can be between any pair of nodes (potentially at different
   nesting levels).

   Digraphs, nodes, and edges also optionally have a JSON property bag,
   allowing round-tripping of arbitrary key/value pairs through SARIF. */

class digraph;
class node;
class edge;

/* A base class for digraph, node, and edge to allow them to have
   an optional JSON property bag. */

class object
{
public:
  /* String properties. */
  const char *get_property (const json::string_property &property) const;
  void set_property (const json::string_property &property,
                     const char *utf8_value);

  /* Integer properties. */
  bool maybe_get_property (const json::integer_property &property,
                           long &out) const;
  void set_property (const json::integer_property &property, long value);

  /* Bool properties. */
  tristate
  get_property_as_tristate (const json::bool_property &property) const;
  void set_property (const json::bool_property &property, bool value);

  /* Array-of-string properties. */
  json::array *
  get_property (const json::array_of_string_property &property) const;

  /* enum properties. */
  template <typename EnumType>
  EnumType
  get_property (const json::enum_property<EnumType> &property) const
  {
    if (m_property_bag)
      {
        EnumType result;
        if (m_property_bag->maybe_get_enum<EnumType> (property, result))
          return result;
      }
    return json::enum_traits<EnumType>::get_unknown_value ();
  }
  template <typename EnumType>
  void
  set_property (const json::enum_property<EnumType> &property,
                EnumType value)
  {
    auto &bag = ensure_property_bag ();
    bag.set_enum<EnumType> (property, value);
  }

  /* json::value properties. */
  const json::value *get_property (const json::json_property &property) const;
  void set_property (const json::json_property &property,
                     std::unique_ptr<json::value> value);

  json::object *
  get_property_bag () const { return m_property_bag.get (); }

  json::object &
  ensure_property_bag ();

  void
  set_property_bag (std::unique_ptr<json::object> property_bag)
  {
    m_property_bag = std::move (property_bag);
  }

private:
  std::unique_ptr<json::object> m_property_bag;
};

// A directed graph, corresponding to SARIF v2.1.0 section 3.39.

class digraph : public object
{
public:
  friend class node;
  friend class edge;

  digraph () : m_next_edge_id_index (0) {}
  virtual ~digraph () {}

  const char *
  get_description () const
  {
    if (!m_description)
      return nullptr;
    return m_description->c_str ();
  }

  void
  set_description (const char *desc)
  {
    if (desc)
      m_description = std::make_unique<std::string> (desc);
    else
      m_description = nullptr;
  }
  void
  set_description (std::string desc)
  {
    m_description = std::make_unique<std::string> (std::move (desc));
  }

  node *
  get_node_by_id (const char *id) const
  {
    auto iter = m_id_to_node_map.find (id);
    if (iter == m_id_to_node_map.end ())
      return nullptr;
    return iter->second;
  }

  edge *
  get_edge_by_id (const char *id) const
  {
    auto iter = m_id_to_edge_map.find (id);
    if (iter == m_id_to_edge_map.end ())
      return nullptr;
    return iter->second;
  }

  size_t
  get_num_nodes () const
  {
    return m_nodes.size ();
  }

  node &
  get_node (size_t idx) const
  {
    return *m_nodes[idx].get ();
  }

  size_t
  get_num_edges () const
  {
    return m_edges.size ();
  }

  edge &
  get_edge (size_t idx) const
  {
    return *m_edges[idx].get ();
  }

  void
  dump () const;

  std::unique_ptr<json::object>
  make_json_sarif_graph () const;

  std::unique_ptr<dot::graph>
  make_dot_graph () const;

  void
  add_node (std::unique_ptr<node> n)
  {
    gcc_assert (n);
    m_nodes.push_back (std::move (n));
  }

  void
  add_edge (std::unique_ptr<edge> e)
  {
    gcc_assert (e);
    m_edges.push_back (std::move (e));
  }

  void
  add_edge (const char *id,
            node &src_node,
            node &dst_node,
            const char *label = nullptr);

  std::unique_ptr<digraph> clone () const;

  const char *get_graph_kind () const;
  void set_graph_kind (const char *);

  const std::map<std::string, node *> &
  get_all_nodes () const
  {
    return m_id_to_node_map;
  }

private:
  void
  add_node_id (std::string node_id, node &new_node)
  {
    m_id_to_node_map.insert ({std::move (node_id), &new_node});
  }

  void
  add_edge_id (std::string edge_id, edge &new_edge)
  {
    m_id_to_edge_map.insert ({std::move (edge_id), &new_edge});
  }

  std::string
  make_edge_id (const char *edge_id);

  std::unique_ptr<std::string> m_description;
  std::map<std::string, node *> m_id_to_node_map;
  std::map<std::string, edge *> m_id_to_edge_map;
  std::vector<std::unique_ptr<node>> m_nodes;
  std::vector<std::unique_ptr<edge>> m_edges;
  size_t m_next_edge_id_index;
};

// A node in a directed graph, corresponding to SARIF v2.1.0 section 3.40.

class node : public object
{
public:
  virtual ~node () {}

  node (digraph &g, std::string id)
  : m_id (id),
    m_physical_loc (UNKNOWN_LOCATION)
  {
    g.add_node_id (std::move (id), *this);
  }
  node (const node &) = delete;

  std::string
  get_id () const { return m_id; }

  const char *
  get_label () const
  {
    if (!m_label)
      return nullptr;
    return m_label->c_str ();
  }

  void
  set_label (const char *label)
  {
    if (label)
      m_label = std::make_unique<std::string> (label);
    else
      m_label = nullptr;
  }
  void
  set_label (std::string label)
  {
    m_label = std::make_unique<std::string> (std::move (label));
  }

  size_t
  get_num_children () const { return m_children.size (); }

  node &
  get_child (size_t idx) const { return *m_children[idx].get (); }

  void
  add_child (std::unique_ptr<node> child)
  {
    gcc_assert (child);
    m_children.push_back (std::move (child));
  }

  location_t
  get_physical_loc () const
  {
    return m_physical_loc;
  }

  void
  set_physical_loc (location_t physical_loc)
  {
    m_physical_loc = physical_loc;
  }

  logical_locations::key
  get_logical_loc () const
  {
    return m_logical_loc;
  }

  void
  set_logical_loc (logical_locations::key logical_loc)
  {
    m_logical_loc = logical_loc;
  }

  void print (graphviz_out &gv) const;

  void
  dump () const;

  std::unique_ptr<json::object>
  to_json_sarif_node () const;

  std::unique_ptr<node>
  clone (digraph &new_graph,
         std::map<node *, node *> &node_mapping) const;

private:
  std::string m_id;
  std::unique_ptr<std::string> m_label;
  std::vector<std::unique_ptr<node>> m_children;
  location_t m_physical_loc;
  logical_locations::key m_logical_loc;
};

// An edge in a directed graph, corresponding to SARIF v2.1.0 section 3.41.

class edge : public object
{
public:
  virtual ~edge () {}

  /* SARIF requires us to provide unique edge IDs within a graph,
     but otherwise we don't need them.
     Pass in nullptr for the id to get the graph to generate a unique
     edge id for us. */
  edge (digraph &g,
        const char *id,
        node &src_node,
        node &dst_node)
  : m_id (g.make_edge_id (id)),
    m_src_node (src_node),
    m_dst_node (dst_node)
  {
    g.add_edge_id (m_id, *this);
  }

  std::string
  get_id () const { return m_id; }

  const char *
  get_label () const
  {
    if (!m_label)
      return nullptr;
    return m_label->c_str ();
  }

  void
  set_label (const char *label)
  {
    if (label)
      m_label = std::make_unique<std::string> (label);
    else
      m_label = nullptr;
  }

  node &
  get_src_node () const { return m_src_node; }

  node &
  get_dst_node () const { return m_dst_node; }

  void
  dump () const;

  std::unique_ptr<json::object>
  to_json_sarif_edge () const;

  std::unique_ptr<edge>
  clone (digraph &new_graph,
         const std::map<diagnostics::digraphs::node *,
                        diagnostics::digraphs::node *> &node_mapping) const;

private:
  std::string m_id;
  std::unique_ptr<std::string> m_label;
  node &m_src_node;
  node &m_dst_node;
};

} // namespace digraphs
} // namespace diagnostics

#endif /* ! GCC_DIAGNOSTICS_DIGRAPHS_H */
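
The header above cannot be compiled outside of GCC, so the following is a
stand-alone, simplified model of the same ownership scheme rather than the
real classes.  Everything in namespace sketch is hypothetical.  In
particular, the "edgeN" format for generated edge IDs is an assumption:
the header only declares make_edge_id and does not show what it returns.
The sketch tries to show two things that are visible in the header:
constructing a node or edge registers its ID with the graph, and ownership
is transferred separately via add_node, add_child or add_edge, so that a
nested node can still be found by ID from the graph.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

namespace sketch {

class graph;

class node
{
public:
  node (graph &g, std::string id); // Registers ID with G; see below.
  const std::string &get_id () const { return m_id; }
  void add_child (std::unique_ptr<node> child)
  {
    m_children.push_back (std::move (child));
  }
  std::size_t get_num_children () const { return m_children.size (); }

private:
  std::string m_id;
  std::vector<std::unique_ptr<node>> m_children;
};

class edge
{
public:
  // Pass nullptr as ID to have the graph generate one.
  edge (graph &g, const char *id, node &src, node &dst);
  const std::string &get_id () const { return m_id; }
  node &get_src_node () const { return m_src; }
  node &get_dst_node () const { return m_dst; }

private:
  std::string m_id;
  node &m_src;
  node &m_dst;
};

class graph
{
public:
  // Ownership of top-level nodes and of edges.
  void add_node (std::unique_ptr<node> n) { m_nodes.push_back (std::move (n)); }
  void add_edge (std::unique_ptr<edge> e) { m_edges.push_back (std::move (e)); }
  std::size_t get_num_nodes () const { return m_nodes.size (); }

  // Lookup by ID, covering nodes at every nesting level.
  node *get_node_by_id (const std::string &id) const
  {
    auto iter = m_id_to_node.find (id);
    return iter == m_id_to_node.end () ? nullptr : iter->second;
  }
  edge *get_edge_by_id (const std::string &id) const
  {
    auto iter = m_id_to_edge.find (id);
    return iter == m_id_to_edge.end () ? nullptr : iter->second;
  }

  void register_node (const std::string &id, node &n) { m_id_to_node[id] = &n; }
  std::string register_edge (const char *id, edge &e)
  {
    // Assumption: generated IDs are "edge0", "edge1", ...
    std::string final_id
      = id ? std::string (id) : "edge" + std::to_string (m_next_edge_idx++);
    m_id_to_edge[final_id] = &e;
    return final_id;
  }

private:
  std::map<std::string, node *> m_id_to_node;
  std::map<std::string, edge *> m_id_to_edge;
  std::vector<std::unique_ptr<node>> m_nodes;
  std::vector<std::unique_ptr<edge>> m_edges;
  std::size_t m_next_edge_idx = 0;
};

inline node::node (graph &g, std::string id) : m_id (std::move (id))
{
  g.register_node (m_id, *this);
}

inline edge::edge (graph &g, const char *id, node &src, node &dst)
: m_id (g.register_edge (id, *this)), m_src (src), m_dst (dst)
{
}

// Build a graph with two top-level nodes, one nested node, and two edges.
inline std::unique_ptr<graph>
make_example_graph ()
{
  auto g = std::make_unique<graph> ();
  auto a = std::make_unique<node> (*g, "a");
  auto b = std::make_unique<node> (*g, "b");
  auto b_child = std::make_unique<node> (*g, "b.0");
  node &a_ref = *a;
  node &child_ref = *b_child;
  b->add_child (std::move (b_child));
  g->add_node (std::move (a));
  g->add_node (std::move (b));
  g->add_edge (std::make_unique<edge> (*g, nullptr, a_ref, child_ref));
  g->add_edge (std::make_unique<edge> (*g, "named", child_ref, a_ref));
  return g;
}

} // namespace sketch
```

In this model the graph reports two top-level nodes, yet "b.0" is still
reachable through get_node_by_id, and the edge between "a" and "b.0"
crosses nesting levels, matching the description in the header comment.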