Graph Construction Based on Local Representativeness

  • Conference paper
  • First Online: 01 July 2017
  • Cite this conference paper

graph construction analysis and problem solving

  • Eliska Ochodkova 15 ,
  • Sarka Zehnalova 15 &
  • Milos Kudelka 15  

Part of the book series: Lecture Notes in Computer Science ((LNTCS,volume 10392))

Included in the following conference series:

  • International Computing and Combinatorics Conference

1775 Accesses

13 Citations

Graph construction is a known method of transferring the problem of classic vector data mining to network analysis. The advantage of networks is that the data are extended by links between certain (similar) pairs of data objects, so relationships in the data can then be visualized in a natural way. In this area, there are many algorithms, often with significantly different results. A common problem for all algorithms is to find relationships in data so as to preserve the characteristics related to the internal structure of the data. We present a method of graph construction based on a network reduction algorithm, which is found on analysis of the representativeness of the nodes of the network. It was verified experimentally that this algorithm preserves structural characteristics of the network during the reduction. This approach serves as the basis for our method which does not require any default parameters. In our experiments, we show the comparison of our graph construction method with one well-known method based on the most commonly used approach.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

  • Get 10 units per month
  • Download Article/Chapter or Ebook
  • 1 Unit = 1 Article or 1 Chapter
  • Cancel anytime
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
  • Available as EPUB and PDF
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Similar content being viewed by others

graph construction analysis and problem solving

Graph Clustering by Maximizing Statistical Association Measures

graph construction analysis and problem solving

Distances on a Graph

graph construction analysis and problem solving

Structural Analysis of Networks

Belkin, M., Niyogi, P., Sindhwani, V.: Manifold regularization: A geometric framework for learning from labeled and unlabeled examples. J. Mach. Learn. Res. 7 , 2399–2434 (2006)

MathSciNet   MATH   Google Scholar  

Blondel, V.D., Guillaume, J.L., Lambiotte, R., Lefebvre, E.: Fast unfolding of communities in large networks. J. Stat. Mech. Theory Exp. 2008 (10), P10008 (2008)

Article   Google Scholar  

Charytanowicz, M., Niewczas, J., Kulczycki, P., Kowalski, P.A., Łukasik, S., Żak, S.: Complete gradient clustering algorithm for features analysis of X-ray images. In: Piȩtka, E., Kawa, J. (eds.) Information Technologies in Biomedicine. Advances in Intelligent and Soft Computing, vol. 69, pp. 15–24. Springer, Heidelberg (2010)

Chapter   Google Scholar  

Daitch, S.I., Kelner, J.A., Spielman, D.A.: Fitting a graph to vector data. In: Proceedings of the 26th Annual International Conference on Machine Learning, pp. 201–208. ACM (2009)

Google Scholar  

Derényi, I., Palla, G., Vicsek, T.: Clique percolation in random networks. Phys. Rev. Lett. 94 (16), 160202 (2005)

Dhillon, P.S., Talukdar, P.P., Crammer, K.: Inference-driven metric learning for graph construction. In: 4th North East Student Colloquium on Artificial Intelligence (2010)

Dornaika, F., Bosaghzadeh, A.: Adaptive graph construction using data self-representativeness for pattern classification. Inf. Sci. 325 , 118–139 (2015)

Article   MathSciNet   Google Scholar  

Higuera, C., Gardiner, K.J., Cios, K.J.: Self-organizing feature maps identify proteins critical to learning in a mouse model of down syndrome. PLoS one 10 (6), e0129126 (2015)

Huttenhower, C., Flamholz, A.I., Landis, J.N., Sahi, S., Myers, C.L., Olszewski, K.L., Hibbs, M.A., Siemers, N.O., Troyanskaya, O.G., Coller, H.A.: Nearest neighbor networks: clustering expression data based on gene neighborhoods. BMC Bioinform. 8 (1), 250 (2007)

Lichman, M.: UCI machine learning repository (2013). http://archive.ics.uci.edu/ml

Liu, W., Chang, S.F.: Robust multi-class transductive learning with graphs. In: 2009 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2009, pp. 381–388. IEEE (2009)

Maier, M., Von Luxburg, U., Hein, M.: Influence of graph construction on graph-based clustering measures. In: NIPS, pp. 1025–1032 (2008)

Newman, M.E.: Assortative mixing in networks. Phys. Rev. Lett. 89 (20), 208701 (2002)

Subramanya, A., Talukdar, P.P.: Graph-based semi-supervised learning. In: Synthesis Lectures on Artificial Intelligence and Machine Learning, vol. 8(4), pp. 1–125 (2014)

Wang, F., Zhang, C.: Label propagation through linear neighborhoods. IEEE Trans. Knowl. Data Eng. 20 (1), 55–67 (2008)

Yan, S., Wang, H.: Semi-supervised learning by sparse representation. In: Proceedings of the 2009 SIAM International Conference on Data Mining, SIAM, pp. 792–801 (2009)

Zehnalova, S., Kudelka, M., Platos, J.: Local representativeness in vector data. In: 2014 IEEE International Conference on Systems, Man and Cybernetics (SMC), pp. 894–899. IEEE (2014)

Zehnalova, S., Kudelka, M., Platos, J., Horak, Z.: Local representatives in weighted networks. In: 2014 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM), pp. 870–875. IEEE (2014)

Download references

Acknowledgments

This work was supported by grant of Ministry of Health of Czech Republic (MZ CR VES16-31852A) and by SGS, VSB-Technical University of Ostrava, under the grant no. SP2017/85.

Author information

Authors and affiliations.

FEI, VSB, Technical University of Ostrava, 17. Listopadu 15, 708 33, Ostrava, Czech Republic

Eliska Ochodkova, Sarka Zehnalova & Milos Kudelka

You can also search for this author in PubMed   Google Scholar

Corresponding author

Correspondence to Milos Kudelka .

Editor information

Editors and affiliations.

Department of Computing, Hong Kong Polytechnic University, Hong Kong, China

Texas A&M University, College Station, Texas, USA

Jianer Chen

Rights and permissions

Reprints and permissions

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper.

Ochodkova, E., Zehnalova, S., Kudelka, M. (2017). Graph Construction Based on Local Representativeness. In: Cao, Y., Chen, J. (eds) Computing and Combinatorics. COCOON 2017. Lecture Notes in Computer Science(), vol 10392. Springer, Cham. https://doi.org/10.1007/978-3-319-62389-4_54

Download citation

DOI : https://doi.org/10.1007/978-3-319-62389-4_54

Published : 01 July 2017

Publisher Name : Springer, Cham

Print ISBN : 978-3-319-62388-7

Online ISBN : 978-3-319-62389-4

eBook Packages : Computer Science Computer Science (R0)

Share this paper

Anyone you share the following link with will be able to read this content:

Sorry, a shareable link is not currently available for this article.

Provided by the Springer Nature SharedIt content-sharing initiative

  • Publish with us

Policies and ethics

  • Find a journal
  • Track your research
  • DOI: 10.1080/00207543.2022.2078748
  • Corpus ID: 249144487

Data-driven causal knowledge graph construction for root cause analysis in quality problem solving

  • Zhaoguang Xu , Yanzhong Dang
  • Published in International Journal of… 27 May 2022
  • Engineering, Computer Science

15 Citations

Constructing and interpreting causal knowledge graphs from news, graph-enabled cognitive digital twins for causal inference in maintenance processes, hybrid intelligence failure analysis for industry 4.0: a literature review and future prospective, weight-based and digital payment process flow design to decrease quarry sand sales fraudulence risk, manufacturing process optimization in the process industry, multi-objective economic-statistical design of simple linear profiles using a combination of nsga-ii, rsm, and topsis, interpretability of causal discovery in tracking deterioration in a highly dynamic process, esgnet: a multimodal network model incorporating entity semantic graphs for information extraction from chinese resumes, causalkgpt: industrial structure causal knowledge-enhanced large language model for cause analysis of quality problems in aerospace product manufacturing, ids-kg: an industrial dataspace-based knowledge graph construction approach for smart maintenance, 88 references, automated digital cause-and-effect diagrams to assist causal analysis in problem-solving: a data-driven approach, a survey on extraction of causal relations from natural language text, event causality extraction based on connectives analysis, a causality mining and knowledge graph based method of root cause diagnosis for performance anomaly in cloud applications, causal network construction to support understanding of news, incremental construction of causal network from news articles, knowledge-driven intelligent quality problem-solving system in the automotive industry, extracting explicit and implicit causal relations from sparse, domain-specific texts, on the influence of overlap in automatic root cause analysis in manufacturing, a multi-method approach to building causal performance maps from expert knowledge, related papers.

Showing 1 through 3 of 0 Related Papers

  • Docs »
  • Graph problems
  • Edit on GitHub

Graph problems ¶

Adapt everything: figures, maths, …

In this chapter we will present models for three optimization problems with a combinatorial structure (graph partitioning problem, maximum stable set problem, graph coloring problem) and try to solve them with SCIP/Python. All the models dealt with here are based on the definition of a graph. A graph is an abstract concept, a construction derived from vertices and edges linking two vertices, but many of the practical optimization problem can be defined naturally by means of graphs.

The roadmap for this chapter is the following. Section gpp deals with the basic notions of graph theory and with the graph partitioning problem, describing a method for dealing with a quadratic objective function by linearizing it. Section ssp presents the maximum stable set problem. Section gcp describes the graph coloring problem, proposing an improved model for avoiding symmetry in the solution space.

Graph partitioning problem ¶

Consider the following scenario.

Six friends are deciding how to split for forming two teams of mini-soccer (Figure Graph partitioning problem ). Of course, in order to be fair, each team must have the same number of persons — three, in this case. However, having good friends in separate teams should be avoided as much as possible. So, how should the group of persons be divided into two teams?

_images/gpp.png

Graph partitioning problem

  • Graph where an edge between two persons indicates that they are good friends.
  • Solution where the number of good friends in (equally divided) different teams is minimum. Pairs of good friends belonging to different teams are represented by a thick line — there are two, in this case. Hence, the objective value for equal division is 2.

The case above is an example of a combinatorial optimization problem called the graph partitioning problem . Actually, rather than creating football teams, this NP-hard problem has a number of serious applications, including VLSI (very-large-scale integration) design.

This real problem is easy to understand using the concept of “graph”. A graph is an abstract object composed of vertices and edges ; an edge is a link between two vertices. Graphs are very useful tools to unambiguously represent many real problems. As an example, let us represent a friendship relationship with a graph.

You have six friends. First of all, represent each of these friends by a circle; in graph theory, these circles are called vertices (also called nodes or points ). As always in life, some of these fellows have a good relationship between them, whereas others have a bad relationship. In order to organize these complicated relashionships, you connect with a line each pair of your friends which are in good terms with each other. In graph theory, such a line is called an edge (also called arc or line ). When represented in this way, the friendship scenario becomes very easy to grasp. This representation is a graph.

More precisely, the graphs dealt with in this chapter are called undirected graphs , because the lines connecting two vertices have no implied direction.

The set of vertices, here representing the set of friends, is usually referred to as \(V\) . The set of edges, here representing friendship connections, is usually referred to as \(E\) . Since a graph is composed of a set of vertices and a set of edges, it is commonly denoted as \(G = (V,E)\) . Vertices which are endpoints of an edge are said to be adjacent to each other. Besides, an edge is said to be incident to the vertices at both ends. The number of edges connected to a vertex defines its degree .

The graph partitioning problem can be neatly described using this terminology.

Graph partitioning problem Given an undirected graph \(G = (V,E)\) with an even number of vertices \(n = |V|\) [1] , divide \(V\) into two subsets \(L\) and \(R\) with the same number of vertices ( uniform partition or equipartition ) satisfying \(L \cap R = \emptyset\) , \(L \cup R = V\) , \(|L| = |R| = n/2\) , so as to minimize the number of edges across \(L\) and \(R\) (more precisely, the number of edges \(\{i,j\}~\) such that either \(i \in L\) and \(j \in R\) , or \(i \in R\) and \(j \in L\) ).

In order to define the graph partitioning problem more precisely, we will formulate it as an integer optimization problem. Given an undirected graph \(G = (V,E)\) , the pair \((L,R)\) is a partition the set of vertices into two subsets \(L\) and \(R\) (i.e., a bipartition) if it satisfies \(L \cap R = \emptyset\) (no intersection) and \(L \cup R = V\) (the union is the whole set of vertices). Even though \(L\) stands for left and \(R\) for right, nothing changes if their roles are swapped; hence, \((L,R)\) is a non-ordered pair. Introducing binary variables \(x_i\) which will take the value 1 when vertex \(i\) is included in subset \(L\) , and the value 0 otherwise (i.e., \(i\) is included in subset \(R\) ), for having vertices equally divided the sum of \(x_i\) must be equal to \(n/2\) . When an edge \(\{i,j\}~\) is across \(L\) and \(R\) , either \(x_i (1-x_j)\) or \((1-x_i) x_j\) become 1, allowing us to write the following formulation.

Many of the available mathematical optimization solvers do not support minimization problem whose objective function is not convex (for the definition of convex function refer to Chapter piecewiselinear ). The above quadratic terms are not convex. Even though SCIP does provide support for these cases, it is much more efficient for solving linear problems. Therefore, in cases where we can find an equivalent linear formulation is it advisable to use it. We will see that for the graph partition problem this is possible.

Let binary variables \(y_{ij}\) model the case where edges are incident to different subsets, i.e., \(y_{ij} = 1\) if the endpoints of edge \(\{i,j\}~\) are across \(L\) and \(R\) , \(y_{ij} = 0\) otherwise. Variables \(x_i, i \in V\) have the same meaning as above. With these variables, the graph partitioning problem can be modeled with linear integer optimization as follows.

As the objective is to minimize the sum of variables \(y_{ij}\) , their value will be as much as possible zero, but constraints force some of them to be one. The first constraint defines an equal division of the set of vertices. The second constraint implies that if \(i \in L\) and \(j \not\in L\) (i.e., edge \(\{i,j\}~\) is across subsets \(L\) and \(R\) ), then \(y_{ij} = 1\) The third constraint implies that if \(j \in L\) and \(i \not\in L\) , then \(y_{ij} = 1\) .

A model for this in Python/SCIP can be written as follows:

The function gpp requires as parameters a set of vertices V and a set of edges E . An example of a function for generating such data randomly given below.

With these functions, the main program can be written as follows.

if __name__ == “__main__”: V,E = make_data(4,.5) model = gpp(V,E) model.optimize() print(“Optimal value:”, model.getObjVal())
The number of elements included in a set \(V\) is called the of the set, and is represented by \(|V|\).

Margin seminar 5

Mathematical optimization and constraint programming

Although the central paradigm used in this document for solving optimization problems is mathematical optimization (previously known as mathematical programming ), another framework for solving similar problems is constraint programming . These two technologies, more than competing, complement each other as powerful optimization tools. Depending on the problem it may be advisable to use tools from mathematical optimization, from constraint programming, or to combine the two technologies.

In mathematical optimization, variables must be defined as real or integer numbers. In constraint programming, variables typically take one value from a given discrete set, called the domain . Constraint programming is good at solving problems with a combinatorial structure; it is weak for handling continuous (real) variables, for which mathematical optimization is very powerful. On the other hand, problems containing non-convex expressions, such as the graph partitioning problem, can often be easily solved in constraint programming. In addition, it is also good for problems for which it is difficult to find a feasible solution, such as puzzles or the staff scheduling problem described in Section 9.3 .

SCIP is specialized in constraint integer optimization , combining techniques for constraint programming, mixed-integer optimization, and satisfiability problems.

Maximum stable set problem ¶

You are choosing, from a group of six friends, with whom to go for a picnic. However, persons linked with an edge in Figure Maximum stable set problem are on very unfriendly terms with each other, so if both of them go to the picnic, it will be spoiled. To have as many friends as possible in the picnic, who should be invited?

_images/ssp.png

Maximum stable set problem

  • Graph where an edge between two persons indicates that they are on unfriendly terms.
  • Maximum number of persons that can go to a picnic such that all the invitees are in good terms. The four persons encircled can all be at the picnic without spoiling it; this is the optimal solution.

This is an example of the so-called maximum stable set problem, a fundamental problem in graph theory. The maximum stable sets problem can be defined as follows.

Maximum stable set problem Given an undirected graph \(G = (V,E)\) , a subset \(S \subseteq V\) is called a stable set when there isn’t any edge among vertices of \(S\) . The problem is to find a stable set \(S\) such that its cardinality (i.e., \(|S|\) , the number of vertices it contains) is maximum.

Considering the complementary graph this problem—the complementary graph inverts the edges, i.e., contains edges only between pairs of vertices for which there is no edge in the original graph—the maximum clique problem is defined below. These two problems are equivalent, in the sense that they can be converted through a simple transformation, and the solution is the same (see Figure Maximum stable set and maximum clique ).

_images/mssp-mcp.png

Maximum stable set and maximum clique

  • Maximum stable set.
  • Maximum clique on the complementary graph.
Maximum clique problem Given an undirected graph \(G = (V,E)\) , a subset \(C \subseteq V\) is called a clique when the subgraph induced by \(C\) is complete (in a complete graph there is edge connecting all pairs of vertices; the subgraph induced by a subset of vertices contains all the edges of the original graph with both ends in that subset). The problem is to find a clique \(C\) which maximizes cardinality \(|C|\) .

These problems have applications in coding theory, reliability, genetics, archeology and VLSI design, among others. Using a variable for each vertex \(i\) , which take on the value 1 when vertex \(i\) is included in the stable set, this problem can be formulated as follows.

This formulation can be written as a Python/SCIP program in the following manner.

This function can be used similarly to the one described above for the graph partitioning problem.

Graph coloring problem ¶

You are concerned about how to assign a class to each of your friends. Those which are on unfriendly terms with each other are linked with an edge in Figure Graph coloring problem . If put on the same class, persons on unfriendly terms will start a fight. To divide your friends into as few classes as possible, how should you proceed?

_images/gcp.png

Graph coloring problem

  • Dividing into three classes keeps persons on unfriendly terms in different classes. The value of the objective function (the number of classes) being 3, this is an optimal solution.

This is an example of the classical optimization problem called graph coloring problem , which can be defined as follows.

Graph coloring problem Given an undirected graph \(G = (V,E)\) , a \(K-\) partition is a division of the vertices \(V\) into \(K\) subsets \(V_1, V_2, \ldots, V_K\) such that \(V_i \cap V_j = \emptyset, \forall i \neq j\) (there is no overlap), and \(\bigcup_{j=1}^{K} V_j = V\) (the union of subsets is the full set of vertices). Each \(V_i (i=1, 2, \ldots K)\) is called a color class . In a \(K-\) partition, if all the vertices in a color class \(V_i\) form a stable set (i.e., there is no edge among two vertices in that class), it is called \(K-\) coloring .

For a given undirected graph, the graph coloring problem consists of finding the minimum \(K\) for which there is a \(K-\) coloring; this is called the graph’s chromatic number .

The graph coloring problem has a variety of applications, such as timetabling and frequency allocation.

For writing a mathematical formulation for the graph coloring problem, an upper bound \(K_{\text{max}}\) of the number of colors is required. In other words, the optimal number of colors \(K\) determined as an integer \(1 \leq K \leq K_{\text{max}}\) .

Let us define binary variables \(x_{ik}\) such that when a vertex \(i\) is assigned a color \(k\) , \(x_{ik}\) takes the value 1; otherwise, \(x_{ik}\) takes the value 0. Besides, binary variable \(y_{k}=1\) indicates that color \(k\) has been used, i.e., set \(V_i\) contains at least one vertex; otherwise, \(V_i\) is empty and \(y_{k}=0\) , indicating that color \(k\) was not required.

The first constraint in this formulation indicates that exactly one color is assigned to each vertex. The second constraint connects variables \(x\) and \(y\) , allowing coloring with color \(k\) only if \(y_k=1\) , and forbids the endpoits of any edge \(\{i,j\}~\) , vertices \(i\) and \(j\) , from having the same color simultaneously.

Many of the mathematical optimization solvers, including SCIP, use the branch-and-bound method (see Margin Seminar branch-and-bound ). Since all color classes in the formulation above are treated indifferently, the solution space has a great deal of symmetry. Symmetry causes troubles to branch-and-bound, increasing enormously the size of the tree that needs to be explored. For example, the solutions \(V_1 = {1,2,3}, V2 = {4,5}\) and \(V_1 = {4,5}, V2 = {1,2,3}\) are equivalent, but are represented by different vectors \(x\) and \(y\) . In this case, there occurs a phenomenon where branching on any of the variables \(x,y\) leads to no improvements in the lower bound. When solving the graph coloring problem with a mathematical optimization solver, to avoid some symmetry in the solution space, it is recommended to add the following constraints.

Adding the above constraint forces to use preferentially color classes with low subscripts. Simply adding this constraint may considerably improve the solving time.

Modeling tip 5

When there is symmetry in a formulation, add constraints for removing it.

When formulations for integer optimization problems have a large amount of symmetry, the branch-and-bound method is weak. In such a case, by adding constraints for explicitly breaking symmetry in the formulation, the solving time may be dramatically improved. However, deciding what constraints should be added is still a matter of craftsmanship, there are no uniform guidelines. In the authors’ experience, adding simple constraints using the 0-1 variables such as those added in the graph coloring problem often works well. However, in some cases adding elaborate constraints will break the structure of the problem, and in these cases the solver is likely to become slower; hence, one often needs careful experimentation for deciding if such constraints are useful.

A program in Python/SCIP implementing a formulation for the graph coloring problem, including the a constraint for removing symmetry, is as follows.

In some cases, by adding SOS (special ordered set) constraints this formulation can be improved.

Modeling tip 6

When in a group of binary variables only one (or two consecutive) takes a positive value, use special ordered sets.

A special ordered set (SOS) is a constraint applied to a set of variables. There are SOS constraints of types 1 and 2. For special ordered set constraints of type 1, at most one variable in the set may take non-zero values. For special ordered sets of type 2, at most two consecutive variables (in the specified order) may be non-zero.

In the graph coloring problem, since each vertex may colored in any color, we may declare a special ordered set of type 1 for each vertex, meaning that it takes a value, but at most one may be non-zero.

Especially when the solutions contain symmetry, providing information concerning these special ordered sets often improves efficiency during the search for a solution. (Even though the improvements are not guaranteed, it is worth trying.) In addition, special ordered sets of type 2 play an effective role in the approximation of nonlinear functions by piecewise linear functions. This is described in Section 8.2.1

In the approach shown above, it was intended to minimize the number of colors used, and thus determine the chromatic number \(K\) . Let us now turn to a different approach which will allow us to solve larger instances, where the number of colors used is fixed.

If number of colors to be used is fixed and limited, there is no guarantee that we can assign a different color to each endpoint of all edges in the graph. Let a new variable \(z_{ij}\) be 1 if the endpoints of edge \(\{i,j\}~\) have been assigned the same color (i.e., \(\{i,j\}~\) is a bad edge ), 0 otherwise. The objective is now to minimize the number of bad edges; if the optimum is 0, it means that the colors assigned are feasible, and hence that the number of colors used is an upper bound to the chromatic number \(K\) . On the other hand, if there are bad edges in the optimum, then the value that had been fixed for the number of colors is less than the chromatic number.

Here, the objective is to minimize the number of bad edges. The first constraint indicates that the exactly one color is assigned to each vertex. The second constraint determines that edges \(\{i,j\}~\) whose endpoints \(i\) and \(j\) are assigned the same color class are bad edges (i.e., \(z_{ij}\) = 1).

Follows a program in Python/SCIP implementing this formulation for the graph coloring problem.

The optimum \(K\) (i.e., the smallest value such that the optimum of the above problem is 0) may be determined through binary search. Given an upper and a lower bound to the chromatic number (e.g., the number of vertices \(n\) and 1, respectively), the binary search algorithm can be written as follows.

LATEX/binsearchGCP.png

Binary search method for solving the graph coloring problem.

Next we present code in Python for the same purpose.

The approach for solving the graph coloring problem using binary search and a formulation with fixed \(K\) can solve larger problems that the initial, standalone formulation.

ACM Digital Library home

  • Advanced Search

A graph-constructive approach to solving systems of geometric constraints

  • 176 citation

New Citation Alert added!

This alert has been successfully added and will be sent to:

You will be notified whenever a record that you have chosen has been cited.

To manage your alert preferences, click on the button below.

New Citation Alert!

Please log in to your account

Information & Contributors

Bibliometrics & citations, view options.

  • Vergez L Polette A Pernot J (2024) Interface-Based Search and Automatic Reassembly of CAD Models for Database Expansion and Model Reuse Computer-Aided Design 10.1016/j.cad.2023.103630 167 (103630) Online publication date: Feb-2024 https://doi.org/10.1016/j.cad.2023.103630
  • Farouzi A Zhou X Bellatreche L Malki M Ordonez C (2024) Balanced parallel triangle enumeration with an adaptive algorithm Distributed and Parallel Databases 10.1007/s10619-023-07437-x 42 :1 (103-141) Online publication date: 1-Mar-2024 https://dl.acm.org/doi/10.1007/s10619-023-07437-x
  • Lyu Z Purwar A Liao W (2023) A Unified Real-Time Motion Generation Algorithm for Approximate Position Analysis of Planar N-Bar Mechanisms Journal of Mechanical Design 10.1115/1.4064132 146 :6 Online publication date: 15-Dec-2023 https://doi.org/10.1115/1.4064132
  • Show More Cited By

Index Terms

Applied computing

Arts and humanities

Architecture (buildings)

Computer-aided design

Physical sciences and engineering

Engineering

Computing methodologies

Computer graphics

Shape modeling

Mathematics of computing

Discrete mathematics

Graph theory

Graph algorithms

Theory of computation

Randomness, geometry and discrete structures

Recommendations

Transforming an under-constrained geometric constraint problem into a well-constrained one.

We present an approach for handling geometric constraint problems with under-constrained configurations. The approach works by completing the given set of constraints with constraints that can be defined either automatically or drawn from an ...

Solving Geometric Constraints by a Graph-Constructive Approach

A geometric constraint solver is a major component of recent CAD systems. Graph constructive solvers are stemming from graph theory. In this paper, we describe a 2D constraint-based modeller that uses a graph constructive approach to solve systems of ...

Revisiting decomposition analysis of geometric constraint graphs

Geometric problems defined by constraints can be represented by geometric constraint graphs whose nodes are geometric elements and whose arcs represent geometric constraints. Reduction and decomposition are techniques commonly used to analyze geometric ...

Reviewer: John B. Slater

The problem of accurately reconstructing a two-dimensional geometric configuration of lines and points, or the equivalent in generic terms, given a set of angles and distances relating them, is important for many applications. One promising approach is a graph-theoretic approach, in which the points and lines form the vertices of a graph and the edges represent constraints such as an incidence, angle, or distance. Problems arise when accurate but redundant information is given (the graph is consistently overconstrained) or when not enough information is given (the graph is underconstrained). There are also significant problems when alternate solutions arise due to orientation problems, such as possible reflections of subsets of the configuration. This research paper discusses the problem elegantly, carefully, and with an appropriate degree of formality, giving a number of useful illustrations. It details a quadratic-time algorithm for detecting graph-theoretically whether a problem is well constrained, overconstrained, consistently overconstrained, or underconstrained, and it indicates how to handle the problems of alternative solutions from rough sketches. The authors describe a conceptual algorithm for the actual construction. In some cases this algorithm can be very efficient, depending on the subgraphs, but the authors produce examples that demonstrate that the general problem is NP-hard. While the proofs can be long, they are well presented and aptly illustrated. A good set of appropriate references is provided. It seems that the results are waiting to be built on and that more research is required to identify those problems for which the resulting algorithm is efficient. Nevertheless, the result is a step forward in this field, and the paper explains many of the underlying concepts effectively.

Computing Reviews logo

Access critical reviews of Computing literature here

Become a reviewer for Computing Reviews.

Information

Published in.

cover image ACM Transactions on Graphics

Association for Computing Machinery

New York, NY, United States

Publication History

Permissions, check for updates, author tags.

  • constraint solving
  • geometric constraints
  • graph-based constraint solvers
  • underconstrained systems

Contributors

Other metrics, bibliometrics, article metrics.

  • 176 Total Citations View Citations
  • 1,300 Total Downloads
  • Downloads (Last 12 months) 167
  • Downloads (Last 6 weeks) 17
  • Zou Q Feng H Gao S (2023) Variational Direct Modeling Computer-Aided Design 10.1016/j.cad.2022.103465 157 :C Online publication date: 1-Apr-2023 https://dl.acm.org/doi/10.1016/j.cad.2022.103465
  • Tang Z Zou Q Gao S (2023) A decision-support method for multi-parameter editing of parametric CAD models Advanced Engineering Informatics 10.1016/j.aei.2023.101997 56 (101997) Online publication date: Apr-2023 https://doi.org/10.1016/j.aei.2023.101997
  • Basañez L Nuño E Aldana C (2022) Teleoperation and Level of Automation Springer Handbook of Automation 10.1007/978-3-030-96729-1_20 (457-482) Online publication date: 24-Feb-2022 https://doi.org/10.1007/978-3-030-96729-1_20
  • Pandurangan G Robinson P Scquizzato M (2021) On the Distributed Complexity of Large-Scale Graph Computations ACM Transactions on Parallel Computing 10.1145/3460900 8 :2 (1-28) Online publication date: 15-Jul-2021 https://dl.acm.org/doi/10.1145/3460900
  • Hu H Kleiner M Pernot J Zhang C Huang Y Zhao Q Yeung S (2021) Geometric Over-Constraints Detection: A Survey Archives of Computational Methods in Engineering 10.1007/s11831-020-09509-y Online publication date: 11-May-2021 https://doi.org/10.1007/s11831-020-09509-y
  • Sadjadi M Hagh V Kang M Sitharam M Connelly R Gortler S Theran L Holmes-Cerfon M Thorpe M (2021) Realizations of Isostatic Material Frameworks physica status solidi (b) 10.1002/pssb.202000555 258 :9 Online publication date: 8-Mar-2021 https://doi.org/10.1002/pssb.202000555
  • Cui Y Xiao D Cline D Loguinov D (2020) Improving I/O Complexity of Triangle Enumeration IEEE Transactions on Knowledge and Data Engineering 10.1109/TKDE.2020.3003259 (1-1) Online publication date: 2020 https://doi.org/10.1109/TKDE.2020.3003259

View options

View or Download as a PDF file.

View online with eReader .

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Share this publication link.

Copying failed.

Share on social media

Affiliations, export citations.

  • Please download or close your previous search result export first before starting a new bulk export. Preview is not available. By clicking download, a status dialog will open to start the export process. The process may take a few minutes but once it finishes a file will be downloadable from your browser. You may continue to browse the DL while the export process is in progress. Download
  • Download citation
  • Copy citation

We are preparing your search results for download ...

We will inform you here when the file is ready.

Your file of search results citations is now ready.

Your search export query has expired. Please try again.

Browse Econ Literature

  • Working papers
  • Software components
  • Book chapters
  • JEL classification

More features

  • Subscribe to new research

RePEc Biblio

Author registration.

  • Economics Virtual Seminar Calendar NEW!

IDEAS home

Data-driven causal knowledge graph construction for root cause analysis in quality problem solving

  • Author & abstract
  • Related works & more

Corrections

  • Zhaoguang Xu
  • Yanzhong Dang

Suggested Citation

Download full text from publisher.

Follow serials, authors, keywords & more

Public profiles for Economics researchers

Various research rankings in Economics

RePEc Genealogy

Who was a student of whom, using RePEc

Curated articles & papers on economics topics

Upload your paper to be listed on RePEc and IDEAS

New papers by email

Subscribe to new additions to RePEc

EconAcademics

Blog aggregator for economics research

Cases of plagiarism in Economics

About RePEc

Initiative for open bibliographies in Economics

News about RePEc

Questions about IDEAS and RePEc

RePEc volunteers

Participating archives

Publishers indexing in RePEc

Privacy statement

Found an error or omission?

Opportunities to help RePEc

Get papers listed

Have your research listed on RePEc

Open a RePEc archive

Have your institution's/publisher's output listed on RePEc

Get RePEc data

Use data assembled by RePEc

     
 








 

and

, 2023, vol. 61, issue 10, 3227-3245

Root cause analysis (RCA) plays an essential role in quality problem solving (QPS). Due to the difficulty of obtaining causal knowledge of quality problems, companies often rely on expert experience and conventional RCA tools when conducting RCA. Rich QPS data have remained mostly untapped but provide the potential for causal knowledge mining, while the semistructured nature of these data poses enormous challenges to this task. Thus, we propose a data-driven framework to mine large-scale causalities between quality problems and production factors from QPS data and exploit a causal knowledge graph for quality problems (QPCKG) to express these causalities. We first classify QPS data to identify the data containing causality. The causal linguistic patterns are then employed to extract cause slots and effect slots from these data. Subsequently, we apply the BiLSTM-CRF to extract the core content of problems. A vertex fusion method is last proposed to integrate discrete causalities into QPCKG. The approach is validated in a real-world application at a leading automotive company. Three potential applications of the QPCKG are demonstrated for quality diagnosis and prediction. The QPCKG reveals a grand picture of the core interaction mechanism of product quality and production factors and provides decision-making support for RCA.

2023

(external link)
(text/html)
Access to full text is restricted to subscribers.


This item may be available elsewhere in EconPapers: for items with the same title.

BibTeX RIS (EndNote, ProCite, RefMan) HTML/Text

This journal article can be ordered from

for this article

International Journal of Production Research is currently edited by

in International Journal of Production Research from
Bibliographic data for series maintained by Chris Longhurst ( ).

graph construction analysis and problem solving

Ben Holland

Call graph construction algorithms explained.

A call graph is an artifact produced by program analysis tools to record the relationships between a function and the functions it calls. If you’ve ever wondered how these call graphs actually get generated then keep reading because in this post I’ll be exploring several call graph construction algorithms and their tradeoffs.

Note: Some issues with the information in the RTA section of this article have been brought to my attention and will be fixed shortly. Thanks to David Bacon for pointing them out to me.

If you just want the nitty-gritty formal stuff then check out the paper by Tip and Palsberg , which inspired this post.

Let’s start by taking a look at a call graph. The figure below shows a call graph for a simple Java program that indicates the Java program has a method B that calls the method C and that C in turn calls methods B and D .

Atlas Call Graph

I’ve implemented each variation of the algorithms discussed in this post in an open source Eclipse plugin project called the call-graph-toolbox , which runs on top of the Atlas program analysis framework. Atlas provides an interface for creating Smart Views that allow you to click on source code or program graphs to instantly produce an updated program graph relevant to what you clicked. The call-graph-toolbox provides each call graph implementation as a Smart View so the differences in the algorithms can be observed visually. I’ll be using call-graph-toolbox to help show the differences of each algorithm discussed in this post.

The Problem(s)

Let’s start by understanding why it might be hard to create a call graph. To keep things simple we are just going to be considering Java, but know that the problem gets harder when we start talking about dynamically typed languages (such as Python or Ruby) or languages with function pointers (such as C, C++, or Objective-C).

Consider the following program.

  • What will it print if we run it?
  • What methods would be called at runtime?
  • What edges should the ideal call graph have?

Static vs. Dynamic Dispatch

In Java, when we call a method, the binding between the callsite and the method to be called may be done at compile time or runtime. If the method binding is resolvable at compile time we call it a static dispatch. If at compile time we can’t resolve the callsite we call it a dynamic dispatch and resolve the callsite later at runtime. Dynamic dispatches are extremely common in Object Oriented languages like Java, so we should spend some time making sure we understand how they work.

Since a static dispatch is resolvable at compile time, we know exactly where to find the code for the method we are calling before we even run the program. In Java, calls to methods marked static and calls to constructors are both static dispatches. Since everything in Java is an object it makes sense that we should always know exactly where the code to construct new objects is located in our program. Methods that are marked static (class methods) do not require that we create an instance of an object to invoke the method. Every executable Java program has at least one static method, the main method, which makes sense because before we run the program we haven’t created any new objects yet, but we still want to invoke some procedure to use as an entry point into the program.

Java methods that are not marked static (instance methods) require an instance of an object. We can imagine a Person class with a method getName() that returns the name of a person. Since we don’t want to assign the same name to all people, the getName() method should be associated with an instance of a person. This means we need to call getName() on a Person object (and not directly on the type like a class method). A call to an instance method is resolved at runtime using a dynamic dispatch and the reason why will become clear soon.

Let’s take a look at the print callsite in the main method of our example program from above. The print method is declared in class A and then overridden by A ’s children in classes B , C , and D . We know that print is an instance method because it is not marked as a static method. If we run this program, the print method declared in class B will get executed. This is because the variable b2 is an object of type B . If you wanted to call the print method declared in class C you could replace b2 with c2 . This means that we have to know the type of the object the instance method is being called on to know which method will get called at runtime!

Now remember that every object in Java can be drawn in a tree hierarchy with java.lang.Object at the very top. The type hierarchy for our small program is shown below.

Type Hierarchy

Any non-private instance method declared in a parent type is inherited by the child, unless that child provides an alternative implementation of the inherited method (by overriding the method). As a result of Java’s type hierarchy we can also assign any object type to a variable of the same type or a variable with the type of any of the object’s parent types (including java.lang.Object ). This means that someone could write the following code.

The instance method toString is declared in java.lang.Object so it can be called on any object type. Since we don’t know the type of the object in variable o we have to assume that java.lang.Object ’s toString method or any object that overrides toString could be called at runtime! Since Java highly encourages developers to override the toString method this leaves us with quite a few possibilities (and only one correct answer).

To answer our questions from above, we would all probably agree that an edge from the print callsite (at b2.print(c2) ) to B ’s print method implementation should be in our call graph because B ’s print method is what gets executed at runtime. We also know that it could be tricky to figure out exactly which method would get called at runtime as a result of a dynamic dispatch because it could be tricky to statically determine the runtime type of b2 . So should we conservatively add edges to A , B , C , and D ’s print methods if we can’t figure out the type of b2 even though A , C , and D ’s print methods are never called? What’s better a call graph with all of the possible edges or a call graph with only the edges we are sure are correct?

We say a call graph is “sound” if it has all the edges that are possible at runtime. We say a call graph is “precise” if it does not have edges that do not occur at runtime. It’s easy to be sound, but its hard to be sound and precise.

Whole vs. Partial Program Analysis

Partial program analysis occurs when we don’t have the entire program or when we can’t scale up our analysis to run over the entire program. You might think your Hello World program is small, but don’t forget you are running it on top of a vast set of libraries provided with the Java Runtime Environment. If you’re really crazy you might also consider the native implementation of the virtual machine itself or the operating system that virtual machine is running on and so on and so on!

In practice sometimes we just want to analyze a specific library, which could be used by many applications, so we are forced to do partial program analysis. Consider our example from before. What would happen if we removed the main method and just consider the classes A , B , C , and D ? Since the main method is the only place any types are actually created, it would look like the four print instance methods are just dead code, but don’t forget that this “library” could be used by any Java program! Since we could conceive of Java programs that could call all four print methods we can’t consider any of them dead code. So with that in mind it is going to be important to specify whether or not we are analyzing a partial program or a whole program when we construct our call graphs.

Idea 1 - Class Hierarchy Analysis (CHA)

Let’s start by building a sound call graph (we’ll worry about precision later). The first algorithm you might think to implement would probably be a variation of Reachability Analysis (RA). The basic idea is that for each method, we visit each callsite and then get the name of the method the callsite is referencing. We then add an edge from the method containing the callsite to every method with the same name as the callsite.

The result is a completely sound call graph, but not a very precise call graph. We might even match callsites of print(Object o) to a method that takes a different number of parameters such as print(Object o1, Object o2) . We could make our matching implementation stronger by matching the entire method signature (method name, method parameter counts/types, return type), but we still are going to have plenty of bad (conservative) matches. Consider the following code snippet with respect to our first example.

A simple reachability analysis would add edges from the print callsite to A , B , C , and D ’s print methods, but we declared b as a B type so we know the dispatch has to at least go to B or a child of B ’s print method. We don’t know what happened in the ... , so it is possible that an instance of a C type could have been assigned to b .

At this point it should be becoming clear that we could improve upon RA by considering the type hierarchy. Class Hierarchy Analysis (CHA) was first proposed in a 1995 paper by Dean, Grove, and Chambers to do just this. Class Hierarchy Analysis looks at the declared type of the variable (the receiver object) used to invoke the dynamic dispatch and restricts call edges to the inherited method implementation or the methods declared in the subtype hierarchy of the declared type of the receiver object. The result is a vast improvement on RA that is still sound and cheap to compute.

I’ve implemented both RA and CHA call graph construction algorithms in my call-graph-toolbox . The results for our first example are shown below. Note that RA picks up an extra print method signature in java.io.PrintStream .

Reachability Analysis Call Graph

Now let’s take a look at how CHA would do when analyzing a library. For the most part, it does pretty well! A class hierarchy analysis doesn’t care about where the types are instantiated, since it only uses the declared type of the receiver object. The only tricky part is that an application could declare a type that overrides an instance method in the library, in which case the dynamic dispatch could actually dispatch to a method outside of the library (an application callback). So how can we create a call edge for a method that doesn’t exist yet!? Well all we have to work with is the fact that whatever method is going to override our library method has to be in a subtype of the library’s type that declared the method being overridden.

Consider the case where a library defines a single abstract class with a method sort . The library contains no subtypes of Data that implement the sort method. Somewhere else in the library a call to the sort method is made. At runtime a subtype of Data will exist with a sort method, but not when the library was compiled. The best place I’ve seen this problem defined is as the Separate Compilation Assumption by Karim Ali .

There is no concrete method implementation for sort in our library, so adding a call edge from sortIt to sort might be a bit misleading (because Data ’s sort method can’t actually be a runtime dispatch target). Instead what we could do is modify CHA to add a “library-call” edge to sort to indicate that any dynamic dispatch that gets resolved at runtime must override sort . Running the call-graph-toolbox algorithm for CHA in library mode (available in the Eclipse preference menu) for this example shows an example of a “library-call” edge. Aside from abstract methods without any concrete method implementations, remember that any non-final method in any non-final class could be overridden by an application to form a callback.

Library Class Hierarchy Analysis Call Graph

Idea 2 - Rapid Type Analysis (RTA)

Class Hierarchy Analysis gives us a sound and cheap call graph. In fact it is the default call graph implementation for most static analysis tools (including Atlas), but can we do better? Remember that a dynamic dispatch has to be called on an instance of an object, which means that in order for a dispatch to be made to a particular type’s instance method at least one object of that type (or subtype) must have been allocated somewhere in the program. This is the core idea behind Rapid Type Analysis (RTA), which was proposed by Bacon and Sweeney (implementation details available in the extended technical report ).

Given a CHA call graph we start at the main method and iteratively construct a new call graph that is a subset of the CHA call graph by adding only the edges to the methods that are contained in types of objects that were allocated in the main method. The methods that are reachable through the newly added edges are added to a worklist and the process is repeated until the worklist is empty.

Methods can be inherited from parent types so we should consider the supertypes of the allocated types as well. RTA considers that a type could be allocated in a method and then passed as a parameter to another method. Since the resolved calling relationships are being built on-the-fly it’s important to note that RTA may evaluate a method several times (if new callers are discovered the method has to be re-evaluated). RTA runs until the worklist is empty, at which point it has reached a fixed point and cannot resolve any new call edges to add to the call graph.

The result of RTA on our first example is shown below.

Rapid Type Analysis

RTA produces a more precise call graph than CHA (because it is a subset of CHA and removes edges that are likely not feasible at runtime), but the result is no longer sound. Let’s consider a few cases where RTA might not produce a sound call graph.

In the example above main calls foo , which allocates a new A type and stores the instance to a field v . Then main calls bar , which accesses the field and calls toString on the instance stored in v . RTA will not include an edge from bar to toString because neither bar or its parents ( main ) allocated any instance that toString could be called on (however some implementations might consider the String[] args to be a special implicit allocation of an array type of String types, but the type A still does not reach bar in the basic RTA implementation).

Let’s consider another example of unsoundness.

In this example main calls foo , which returns an allocation of type A that is then passed as a parameter in the call to bar . Again in this case the call edge to A ’s toString method would be missing because neither bar or its parents ( main ) allocated a type of A .

As you may have guessed we could modify RTA to consider fields and the types of method parameters and method return types to improve the accuracy of RTA, which is exactly what Tip and Palsberg proposed to do in the paper that inspired this post. In their paper, Tip and Palsberg propose to add several more constraints to RTA.

The first set of constraints concerns global variables and forms the basis for the first variation of RTA, Field Type Analysis (FTA). FTA adds the constraint that any method that reads from a field can inherit the field-compatible allocated types of any method that could write to that field.

The second variation of RTA, Method Type Analysis (MTA), adds constraints involving method parameter types and method return types. MTA adds the constraint that types allocated in a method and then passed to a method through a parameter should be compatible with the called method’s parameter types. MTA also adds the constraint that the return type of each called method be added to the set of allocated types. At first this may seem difficult because the return type would appear to depend on the resolution of dynamic dispatches, but the set of potential dynamic dispatches are statically typed to be the same return type.

Finally, we can combine both sets of constraints defined by MTA and FTA to form a Hybrid Type Analysis (XTA). The result of XTA on our first example is shown below. Note that the call graph depth has been expanded to reveal three newly resolved dispatches to getSimpleName , which are not resolved by RTA because the instance of the Class object is returned by getClass .

XTA

The idea of adding constraints to RTA to improve the accuracy and soundness of the base algorithm is an interesting idea and worth pursuing deeper. If we assume the standard Von Neumann model of a computer, then programs can only communicate data through the stack or the heap. FTA adds constraints for communicating through the heap when new allocations are created and stored in a global variable, while MTA adds additional constraints for communicating through the stack when methods are called with parameters and then subsequently return results. One other way to pass information is by creating and throwing an exception. Tip and Palsberg mention exception handling as an implementation issue, but do not propose a set of contraints to use as an approximation for analyzing programs with exceptions.

Since exception handling is extremely common in Java code, I propose another RTA variation, Exception Type Analysis (ETA), which adds the following constraint. Add the allocated and inherited types of any method with a throw site to every method that contains a reachable type compatible catch block in the current reverse call graph of throwing method. Of course this constraint could be added to XTA, so we should differentiate between the classic XTA and XTA with ETA. The result of ETA for the following code snippet is shown below.

Note that none of the RTA variants proposed by Tip and Palsberg specifically address this case and would fail to add the edge to MyException ’s toString method.

ETA

Now let’s play out the idea of adding constraints a bit further. Consider how RTA would address the following snippet of code.

The variable o1 is assigned an allocation of type A and then the variable o2 is assigned an allocation of type B . RTA is only looking at the allocation types, so the fact that o2 could only be a type B is lost in the approximation. Without considering the assignments that were made to the variable o2 it is impossible to rule out any other dispatch possibilities. We are reaching the point where we can no longer add any more constraints to refine the precision of our call graph without trying to directly determine the type of the variable for the callsite, which brings us to Variable Type Analysis (VTA) in the next section.

Before we move on, let’s consider how RTA performs on a library. Since RTA starts with a worklist containing only the main method, a library without a main method would result in an empty call graph without some modifications to RTA. Without the application code that is using the library, we have to assume that any method could be called by the application. Depending on how many assumptions we want to make we could restrict that set to say only public methods in the library, but we’d be ruling out the fact that an application might extend a type and inherit a protected method or use reflection to call a private method.

We should also assume that the application could allocate any type and pass it to a method, which pretty much brings us back to a Class Hierarchy Analysis in terms of precision. FTA and MTA would fair better than RTA alone for this last assumption as they would at least filter those types based on the compatibility of their types. Finally there is still the possibility the library contains calls to abstract methods that do not have any concrete implementations, so RTA should at least include the library calls calculated in CHA.

Idea 3 - Variable Type Analysis (VTA)

The last approach is rather direct. We have a reference to the variable that is used in the callsite, but we need to learn know its type. One rather straightforward idea is to start at every allocation site and record its type. Then we iterate over every assignment building a graph of each variable and the assignment of the allocation type to a variable or an assignment of a variable to a variable. At the end we can use the graph to answer which set of types a variable could point to based on what could have been assigned to that variable during the execution of the program.

The algorithm looks remarkably similar to RTA in that it is a fixed point algorithm using a worklist, but since we are considering much more information than RTA the cost of running VTA is going to be much higher. Without getting too deep into points-to analysis (a subject for another day) it’d be good to note that a VTA implementation could be accomplished with an implementation of a full blown points-to, which tracks the allocation site that each variable in the program could point-to, or a slimmed down version of a points-to that is only tracking the type of the allocation site for each variable in the program.

The precision of VTA is going to be much better than RTA, but depends on a whole new set of hard problems in computer science that is the focus of much of the recent research in this area. I’ll group all of these problems into one bucket and call them “sensitivities”. One such sensitivity is “flow sensitivity”, which refers to whether or not the ordering of the assignments is taken into account. For example if we assigned the result of a new A type allocation to a variable o and then immediately overwrote the value of o with a different allocation type of B , a locally flow-sensitive implementation could take advantage of the fact that o could only be of type B immediately after the second assignment.

For basic blocks it’s possible to be locally flow-sensitive relatively easily using a Static Single Assignment (SSA) form , but things get tricky when we start talking about loops or the ordering of inter-procedural flows (also known as context-sensitivity). In fact, we know a perfectly precise points-to analysis is simply not possible, because the entire problem can be reduced to the halting problem (see Landi , Ramalingam , Reps ).

The call graph toolbox implements a call graph based on the results of a context-insensitive points-to analysis I wrote available in the points-to toolbox . A context-insensitive points-to analysis is called a 0-CFA (when context is defined as calling context). The results for our first example are shown below.

Variable Type Analysis

In general a points-to based call graph will suffer from similar ailments as RTA and its variants when it comes to library analysis. The difficultly in being precise is not knowing what could happen outside of the code being analyzed. Without a main method a points-to analysis has to consider every allocation type as being possible at runtime and propagate that information forward until a fixed point is reached, but this can be incredibly expensive for large applications. Current research focuses on performing precise yet scalable points-to analysis because it is the basis for solving many program analysis tasks (such as alias analysis, type inference, and call graph construction).

Evaluation and Conclusion

So…which call graph construction algorithm should I use? Well let’s run some numbers on a large application and see.

ConnectBot (Android Application)

Personally, when I test out a new program analysis tool, I like to use the Android application ConnectBot because I’m familiar with it, it is open source, it is fairly large (~235 classes, 33K logical lines of code), and it is decently complex. The developer(s) always seem to do something a little differently than you would expect.

For instance, ConnectBot has an abstract class that defines a concrete method, which is then intentionally overridden by all the abstract class’s children. The base method is an empty method implementation that just has a comment to the effect of “all base classes should override this method” , which just pains me to see because without marking the method abstract we can’t actually guarantee that all children will implement the method. It’s a bug waiting to happen if a new developer doesn’t notice this pattern and accidently, but silently inherits the empty implementation from its parent type. This is one of the reasons why we have abstract methods in the first place!

So if it is true that all the children override the concrete method and the base method’s type cannot be instantiated (because it’s abstract) then the base method is actually dead code and should be removed from the CHA result. Thank you ConnectBot developers for making my program analysis code better by writing goofy code, but…

Whyyy

The results of running each call graph construction algorithm on ConnectBot are shown in the table below.

ConnectBot contains 9,966 callsites with 3,938 static dispatches and 6,028 dynamic dispatches. Of the 6,028 dynamic dispatch callsites, I computed the minimum, maximum, and average number of potential dispatch targets each algorithm resolved. The number of nodes refers to the number of methods in the resulting call graph, whereas the number of edges refers to the number of call edge relationships in the resulting call graph.

It’s also important to note that ConnectBot is an Android application, which means it does not have a main method. It is possible to model the Android lifecycle to figure out the application entry points, but that was not done for this analysis and the call graphs were simply produced using the library assumptions stated earlier for each algorithm.

RA 452.96 4,065 30,067 1 91 6.25
CHA 75.38 3,490 9,400 1 31 1.40
RTA 29.03 2,515 5,166 0 5 0.50
FTA 128.12 2,963 6,969 0 12 0.93
MTA 38.04 2,629 5,462 0 5 0.58
ETA 177.02 2,503 5,359 0 11 0.52
XTA 94.52 2,987 6,776 0 10 0.81
XTA + ETA 279.82 2,954 6,860 0 15 0.83
0-CFA (Points-to) 37.51 3,073 6,388 0 9 1.09

The time column should be taken with a grain of salt. I implemented most algorithms focusing on correctness and readability and didn’t optimize the code too heavily (some caching would go a long way). The points-to analysis on the hand is very optimized code and completes in under half a second and the results are then used to reconstruct the call and control flow graphs after the fact.

Aside from timing, the Min column is interesting. In this table we are including library calls (call edges to abstract methods when no implementation is available) so a callsite with no resolvable dispatches means the algorithm is unsound. We should expect this from RTA and its variants and depending on the implementation, points-to analysis as well. In this case, the points-to analysis is unsound because it was not considering new allocations that occur inside the Java Runtime libraries.

The Average column is also interesting. At runtime only one dispatch target is actually possible for a callsite at a given point in the program execution (its possible the callsite location could be a different target at a later point in the execution). So in an ideal world, our static analysis tool would report exactly one potential dispatch target for each callsite (in a given context). An average number of potential dispatch targets that approaches 1.0 is a sign of a precise call graph. In our results the 0-CFA is the closest to the ideal value.

Apache Commons IO (Application Mode)

Let’s analyze one more application just so we can try out whole program analysis. For this project, I downloaded the source code of the Apache Commons IO 2.4 library and imported all library code into an Eclipse project. I then included the HexDumpTest test from the test package. To make things simple I remove the JUnit assertions and added a main method that created a new HexDumpTest and called the testDump . The results are shown in the table below.

The project contained 2,931 callsites, with 1,634 static dispatches and 1,297 dynamic dispatches.

RA 42.64 2,005 12,700 0 80 11.03
CHA 6.95 1,817 5,119 0 80 3.26
RTA 0.07 31 34 0 3 0.02
FTA 0.19 45 51 0 3 0.02
MTA 0.1 30 33 0 3 0.02
ETA 0.18 31 34 0 3 0.02
XTA 0.28 50 58 0 3 0.02
XTA + ETA 0.53 51 60 0 3 0.04
0-CFA (Points-to) 2.3 1,325 2,126 0 8 0.85

In this table the Min , Max , and Average columns should be taken with another grain of salt because we are only considering methods reachable from the main method, so callsites outside of that reachable set of methods are dead code and should not have resolvable targets!

Apache Commons IO (Library Mode)

For comparison, I commented out the HexDumpTest.main method and collected the table statistics again with library analysis enabled. Without the main method, the project contained 2,929 callsites, with 1,633 static dispatches and 1,296 dynamic dispatches.

RA 43.94 2,139 13,874 1 83 12.21
CHA 7.25 1,882 5,298 1 80 3.42
RTA 5.12 1,159 1,812 0 10 0.59
FTA 10.11 1,280 2,201 0 15 0.91
MTA 6.84 1,196 1,837 0 4 0.63
ETA 9.28 1,154 1,784 0 8 0.56
XTA 10.81 1,321 2,138 0 9 0.89
XTA + ETA 14.54 1,321 2,152 0 10 0.91
0-CFA (Points-to) 2.39 1,400 2,303 0 9 1.01

So to answer the question, which algorithm is the best? CHA is a great contender for when soundness is required and we don’t want to invest a lot of computation time. If we can relax the soundness requirement a bit we might consider RTA or one of its variants, but in my experience most research is trending towards figuring out how to scale a points-to analysis for sound-ish and precise results.

Closing Thoughts

Tip and Palsberg present a nice overview diagram in Figure 1 of their paper , which I have reproduced below with some modifications.

Summary

  • Data Structures
  • Linked List
  • Binary Tree
  • Binary Search Tree
  • Segment Tree
  • Disjoint Set Union
  • Fenwick Tree
  • Red-Black Tree
  • Advanced Data Structures

Graph Data Structure And Algorithms

  • Introduction to Graph Data Structure
  • Graph and its representations
  • Types of Graphs with Examples
  • Basic Properties of a Graph
  • Applications, Advantages and Disadvantages of Graph
  • Transpose graph
  • Difference Between Graph and Tree

BFS and DFS on Graph

  • Breadth First Search or BFS for a Graph
  • Depth First Search or DFS for a Graph
  • Applications, Advantages and Disadvantages of Depth First Search (DFS)
  • Applications, Advantages and Disadvantages of Breadth First Search (BFS)
  • Iterative Depth First Traversal of Graph
  • BFS for Disconnected Graph
  • Transitive Closure of a Graph using DFS
  • Difference between BFS and DFS

Cycle in a Graph

  • Detect Cycle in a Directed Graph
  • Detect cycle in an undirected graph
  • Detect Cycle in a directed graph using colors
  • Detect a negative cycle in a Graph | (Bellman Ford)
  • Cycles of length n in an undirected and connected graph
  • Detecting negative cycle using Floyd Warshall
  • Clone a Directed Acyclic Graph

Shortest Paths in Graph

  • How to find Shortest Paths from Source to all Vertices using Dijkstra's Algorithm
  • Bellman–Ford Algorithm
  • Floyd Warshall Algorithm
  • Johnson's algorithm for All-pairs shortest paths
  • Shortest Path in Directed Acyclic Graph
  • Multistage Graph (Shortest Path)
  • Shortest path in an unweighted graph
  • Karp's minimum mean (or average) weight cycle algorithm
  • 0-1 BFS (Shortest Path in a Binary Weight Graph)
  • Find minimum weight cycle in an undirected graph

Minimum Spanning Tree in Graph

  • Kruskal’s Minimum Spanning Tree (MST) Algorithm
  • Difference between Prim's and Kruskal's algorithm for MST
  • Applications of Minimum Spanning Tree
  • Total number of Spanning Trees in a Graph
  • Minimum Product Spanning Tree
  • Reverse Delete Algorithm for Minimum Spanning Tree

Topological Sorting in Graph

  • Topological Sorting
  • All Topological Sorts of a Directed Acyclic Graph
  • Kahn's algorithm for Topological Sorting
  • Maximum edges that can be added to DAG so that it remains DAG
  • Longest Path in a Directed Acyclic Graph
  • Topological Sort of a graph using departure time of vertex

Connectivity of Graph

  • Articulation Points (or Cut Vertices) in a Graph
  • Biconnected Components
  • Bridges in a graph
  • Eulerian path and circuit for undirected graph
  • Fleury's Algorithm for printing Eulerian Path or Circuit
  • Strongly Connected Components
  • Count all possible walks from a source to a destination with exactly k edges
  • Euler Circuit in a Directed Graph
  • Word Ladder (Length of shortest chain to reach a target word)
  • Find if an array of strings can be chained to form a circle | Set 1
  • Tarjan's Algorithm to find Strongly Connected Components
  • Paths to travel each nodes using each edge (Seven Bridges of Königsberg)
  • Dynamic Connectivity | Set 1 (Incremental)

Maximum flow in a Graph

  • Max Flow Problem Introduction
  • Ford-Fulkerson Algorithm for Maximum Flow Problem
  • Find maximum number of edge disjoint paths between two vertices
  • Find minimum s-t cut in a flow network
  • Maximum Bipartite Matching
  • Channel Assignment Problem
  • Introduction to Push Relabel Algorithm
  • Introduction and implementation of Karger's algorithm for Minimum Cut
  • Dinic's algorithm for Maximum Flow

Some must do problems on Graph

  • Find size of the largest region in Boolean Matrix
  • Count number of trees in a forest
  • A Peterson Graph Problem
  • Clone an Undirected Graph
  • Introduction to Graph Coloring
  • Traveling Salesman Problem (TSP) Implementation
  • Introduction and Approximate Solution for Vertex Cover Problem
  • Erdos Renyl Model (for generating Random Graphs)
  • Chinese Postman or Route Inspection | Set 1 (introduction)
  • Hierholzer's Algorithm for directed graph
  • Boggle (Find all possible words in a board of characters) | Set 1
  • Hopcroft–Karp Algorithm for Maximum Matching | Set 1 (Introduction)
  • Construct a graph from given degrees of all vertices
  • Determine whether a universal sink exists in a directed graph
  • Number of sink nodes in a graph
  • Two Clique Problem (Check if Graph can be divided in two Cliques)

Graph Data Structure is a collection of nodes connected by edges . It’s used to represent relationships between different entities. Graph algorithms are methods used to manipulate and analyze graphs, solving various problems like finding the shortest path or detecting cycles.

graph-data-structure

Table of Content

What is Graph Data Structure?

  • Components of a Graph
  • Basic Operations on Graphs
  • Applications of Graph
  • Basics of Graph
  • BFS and DFS in Graph
  • Cycles in Graph
  • Shortest Path in Graph
  • Minimum Spanning Tree
  • Connectivity in Graph
  • Maximum Flow in Graph
  • Some must do Problems on Graph
  • Some Quizzes

Graph is a non-linear data structure consisting of vertices and edges. The vertices are sometimes also referred to as nodes and the edges are lines or arcs that connect any two nodes in the graph. More formally a  Graph  is composed of a set of vertices(  V  ) and a set of edges(  E  ). The graph is denoted by  G(V, E).

Graph data structures are a powerful tool for representing and analyzing complex relationships between objects or entities. They are particularly useful in fields such as social network analysis, recommendation systems, and computer networks. In the field of sports data science, graph data structures can be used to analyze and understand the dynamics of team performance and player interactions on the field.

Components of a Graph:

  • Vertices:  Vertices are the fundamental units of the graph. Sometimes, vertices are also known as vertex or nodes. Every node/vertex can be labeled or unlabeled.
  • Edges:  Edges are drawn or used to connect two nodes of the graph. It can be ordered pair of nodes in a directed graph. Edges can connect any two nodes in any possible way. There are no rules. Sometimes, edges are also known as arcs. Every edge can be labelled/unlabelled.

Basic Operations on Graphs:

Below are the basic operations on the graph:

  • Insertion of Nodes/Edges in the graph – Insert a node into the graph.
  • Deletion of Nodes/Edges in the graph – Delete a node from the graph.
  • Searching on Graphs – Search an entity in the graph.
  • Traversal of Graphs – Traversing all the nodes in the graph.

Applications of Graph:

Following are the real-life applications:.

  •  Graph data structures can be used to represent the interactions between players on a team, such as passes, shots, and tackles. Analyzing these interactions can provide insights into team dynamics and areas for improvement.
  • Commonly used to represent social networks, such as networks of friends on social media.
  • Graphs can be used to represent the topology of computer networks, such as the connections between routers and switches.
  • Graphs are used to represent the connections between different places in a transportation network, such as roads and airports.
  • Graphs are used in Neural Networks where vertices represent neurons and edges represent the synapses between them. Neural networks are used to understand how our brain works and how connections change when we learn. The human brain has about 10^11 neurons and close to 10^15 synapses.

Basics of Graph:

  • Introduction to Graphs
  • Difference between graph and tree

BFS and DFS in Graph:

  • Breadth First Traversal for a Graph
  • Depth First Traversal for a Graph
  • Applications of Depth First Search
  • Applications of Breadth First Traversal
  • Iterative Depth First Search

Cycles in Graph:

  • Detect cycle in a direct graph using colors
  • Union By Rank and Path Compression in Union-Find Algorithm
  • Introduction to Disjoint Set Data Structure or Union-Find Algorithm

Shortest Path in Graph:

  • Dijkstra’s shortest path algorithm
  • Johnson’s algorithm for All-pairs shortest paths
  • Dial’s Algorithm
  • Karp’s minimum mean (or average) weight cycle algorithm

Minimum Spanning Tree:

  • Prim’s Minimum Spanning Tree (MST)
  • Kruskal’s Minimum Spanning Tree Algorithm
  • Difference between Prim’s and Kruskal’s algorithm for MST
  • Applications of Minimum Spanning Tree Problem
  • Minimum cost to connect all cities
  • Boruvka’s algorithm for Minimum Spanning Tree

Topological Sorting:

  • All topological sorts of a Directed Acyclic Graph
  • Kahn’s Algorithm for Topological Sorting
  • Maximum edges that can be added to DAG so that is remains DAG

Connectivity in Graph:

  • Eulerian path and circuit
  • Fleury’s Algorithm for printing Eulerian Path or Circuit
  • Length of shortest chain to reach the target word
  • Find if an array of strings can be chained to form a circle
  • Tarjan’s Algorithm to find strongly connected Components

Maximum Flow in Graph:

  • Karger’s Algorithm- Set 1- Introduction and Implementation
  • Dinic’s algorithm for Maximum Flow

Some must do Problems on Graph:

  • Find length of the largest region in Boolean Matrix
  • Graph Coloring (Introduction and Applications)
  • Vertex Cover Problem | Set 1 (Introduction and Approximate Algorithm)
  • K Centers Problem | Set 1 (Greedy Approximate Algorithm)
  • Hierholzer’s Algorithm for directed graph
  • Check whether a given graph is Bipartite or not
  • Snake and Ladder Problem
  • Boggle (Find all possible words in a board of characters)
  • Hopcroft Karp Algorithm for Maximum Matching-Introduction
  • Minimum Time to rot all oranges

Some Quizzes:

  • Quizzes on Graph Traversal
  • Quizzes on Graph Shortest Path
  • Quizzes on Graph Minimum Spanning Tree
  • Quizzes on Graphs

Quick Links :

  • Top 10 Interview Questions on Depth First Search (DFS)
  • Some interesting shortest path questions
  • Practice Problems on Graphs
  • Videos on Graphs

Recommended:

  • Learn Data Structure and Algorithms | DSA Tutorial

Please Login to comment...

Similar reads, improve your coding skills with practice.

 alt=

What kind of Experience do you want to share?

graph construction analysis and problem solving

  • solidarity - (ua) - (ru)
  • news - (ua) - (ru)
  • donate - donate - donate

for scientists:

  • ERA4Ukraine
  • Assistance in Germany
  • Ukrainian Global University
  • #ScienceForUkraine

search dblp

default search action

  • combined dblp search
  • author search
  • venue search
  • publication search

clear

"Data-driven causal knowledge graph construction for root cause analysis in ..."

Details and statistics.

DOI: 10.1080/00207543.2022.2078748

access: closed

type: Journal Article

metadata version: 2023-08-28

graph construction analysis and problem solving

Please note: Providing information about references and citations is only possible thanks to to the open metadata APIs provided by crossref.org and opencitations.net . If citation data of your publications is not openly available yet, then please consider asking your publisher to release your citation data to the public. For more information please see the Initiative for Open Citations (I4OC) . Please also note that there is no way of submitting missing references or citation data directly to dblp.

Please also note that this feature is work in progress and that it is still far from being perfect. That is, in particular,

  • the lists below may be incomplete due to unavailable citation data,
  • reference strings may not have been successfully mapped to the items listed in dblp, and
  • we do not have complete and curated metadata for all items given in these lists.

JavaScript is requires in order to retrieve and display any references and citations for this record.

Schloss Dagstuhl - Leibniz Center for Informatics

manage site settings

To protect your privacy, all features that rely on external API calls from your browser are turned off by default . You need to opt-in for them to become active. All settings here will be stored as cookies with your web browser. For more information see our F.A.Q.

Unpaywalled article links

unpaywall.org

load links from unpaywall.org

Privacy notice: By enabling the option above, your browser will contact the API of unpaywall.org to load hyperlinks to open access articles. Although we do not have any reason to believe that your call will be tracked, we do not have any control over how the remote server uses your data. So please proceed with care and consider checking the Unpaywall privacy policy .

Archived links via Wayback Machine

web.archive.org

load content from archive.org

Privacy notice: By enabling the option above, your browser will contact the API of archive.org to check for archived content of web pages that are no longer available. Although we do not have any reason to believe that your call will be tracked, we do not have any control over how the remote server uses your data. So please proceed with care and consider checking the Internet Archive privacy policy .

Reference lists

crossref.org

load references from crossref.org and opencitations.net

Privacy notice: By enabling the option above, your browser will contact the APIs of crossref.org , opencitations.net , and semanticscholar.org to load article reference information. Although we do not have any reason to believe that your call will be tracked, we do not have any control over how the remote server uses your data. So please proceed with care and consider checking the Crossref privacy policy and the OpenCitations privacy policy , as well as the AI2 Privacy Policy covering Semantic Scholar.

Citation data

load citations from opencitations.net

Privacy notice: By enabling the option above, your browser will contact the API of opencitations.net and semanticscholar.org to load citation information. Although we do not have any reason to believe that your call will be tracked, we do not have any control over how the remote server uses your data. So please proceed with care and consider checking the OpenCitations privacy policy as well as the AI2 Privacy Policy covering Semantic Scholar.

OpenAlex data

openalex.org

load data from openalex.org

Privacy notice: By enabling the option above, your browser will contact the API of openalex.org to load additional information. Although we do not have any reason to believe that your call will be tracked, we do not have any control over how the remote server uses your data. So please proceed with care and consider checking the information given by OpenAlex .

w3c valid html

see also: Terms of Use | Privacy Policy | Imprint

dblp was originally created in 1993 at:

University of Trier

since 2018, dblp has been operated and maintained by:

Schloss Dagstuhl - Leibniz Center for Informatics

the dblp computer science bibliography is funded and supported by:

BMBF

  • Stack Overflow for Teams Where developers & technologists share private knowledge with coworkers
  • Advertising & Talent Reach devs & technologists worldwide about your product, service or employer brand
  • OverflowAI GenAI features for Teams
  • OverflowAPI Train & fine-tune LLMs
  • Labs The future of collective knowledge sharing
  • About the company Visit the blog

Collectives™ on Stack Overflow

Find centralized, trusted content and collaborate around the technologies you use most.

Q&A for work

Connect and share knowledge within a single location that is structured and easy to search.

Get early access and see previews of new features.

What are good examples of problems that graphs can solve better than the alternative? [closed]

After reading Stevey Yegge's Get That Job At Google article, I found this little quote interesting:

Whenever someone gives you a problem, think graphs. They are the most fundamental and flexible way of representing any kind of a relationship, so it's about a 50–50 shot that any interesting design problem has a graph involved in it. Make absolutely sure you can't think of a way to solve it using graphs before moving on to other solution types. This tip is important!

What are some examples of problems that are best represented and/or solved by graph data structures/algorithms?

One example I can think of: navigation units (ala Garmin, TomTom), that supply road directions from your current location to another, utilize graphs and advanced pathing algorithms.

What are some others?

  • data-structures
  • graph-theory

George Stocker's user avatar

  • 3 By the way, don't buy those myths about Google interviews. Compared to other places, they sometimes ask super simple and straightforward questions, which can actually throw you off. –  Uri Commented Apr 1, 2009 at 4:08

19 Answers 19

Computer Networks: Graphs model intuitively model computer networks and the Internet. Often nodes will represent end-systems or routers, while edges represent connections between these systems.

Data Structures: Any data structure that makes use of pointers to link data together is making use of a graph of some kind. This includes tree structures and linked lists which are used all the time.

Pathing and Maps: Trying to find shortest or longest paths from some location to a destination makes use of graphs. This can include pathing like you see in an application like Google maps, or calculating paths for AI characters to take in a video game, and many other similar problems.

Constraint Satisfaction: A common problem in AI is to find some goal that satisfies a list of constraints. For example, for a University to set it's course schedules, it needs to make sure that certain courses don't conflict, that a professor isn't teaching two courses at the same time, that the lectures occur during certain timeslots, and so on. Constraint satisfaction problems like this are often modeled and solved using graphs.

Molecules: Graphs can be used to model atoms and molecules for studying their interaction and structure among other things.

MahlerFive's user avatar

  • 5 +1 for Constraint Satisfaction. –  hasan Commented Nov 19, 2009 at 20:11
  • 1 The part on constraint satisfaction sounds interesting. Do you have any articles or papers on this topic to share? –  Markus Johnsson Commented Feb 8, 2011 at 22:03
  • @MahlerFive, I'd love to learn more about constraint satisfaction that as well. Could you please point to the resources worth looking into? –  Uzbekjon Commented Feb 25, 2012 at 18:33

I am very very interested in graph theory and ive used it solved so many different kinds of problem. You can solve a lot of Path related problem, matching problem, structure problems using graph.

Path problems have a lot of applications.

This was in a career cup's interview question. Say you want to find the longest sum of a sub array. For example, [1, 2, 3, -1] has the longest sum of 6. Model it as a Directed Acyclic Graph ( DAG ), add a dummy source, dummy destination. Connect each node with an edge which has a weight corresponding to the number. Now use the Longest Path algorithm in the DAG to solve this problem.

Similarly, Arbitrage problems in financial world or even geometry problems of finding the longest overlapping structure is a similar path problem.

Some obvious ones would be the network problems (where your network could have computers people, organisation charts, etc). You can glean a lot of structural information like

  • which point breaks the graph into two pieces
  • what is the best way to connect them
  • what is the best way to reach one place to another
  • is there a way to reach one place from another, etc.

I've solved a lot of project management related problems using graphs. A sequence of events can be pictured as a directed graph (if you don't have cycles then thats even better). So, now you can

  • sort the events according to their priority
  • you can find the event that is the most crucial (that is would free a lot of other projects)
  • you can find the duration needed to solve the total project (path problem), etc.

A lot of matching problems can be solved by graph. For example, if you need to match processors to the work load or match workers to their jobs. In my final exam, I had to match people to tables in restaurants. It follows the same principle (bipartite matching -> network flow algorithms). Its simple yet powerful.

A special graph, a tree , has numerous applications in the computer science world. For example, in the syntax of a programming language, or in a database indexing structure.

Most recently, I also used graphs in compiler optimization problems. I am using Morgan's Book, which is teaching me fascinating techniques.

The list really goes on and on and on. Graphs are a beautiful math abstraction for relation . You really can do wonders, if you can model it correctly. And since the graph theory has found so many applications, there are many active researches in the field. And because of numerous researches, we are seeing even more applications which is fuelling back researches.

If you want to get started on graph theory, get a good beginner discrete math book ( Rosen comes to my mind), and you can buy books from authors like Fould or Even . CLRS also has good graph algorithms.

nbro's user avatar

Your source code is tree structured, and a tree is a type of graph. Whenever you hear people talking about an AST (Abstract Syntax Tree), they're talking about a kind of graph.

Pointers form graph structures. Anything that walks pointers is doing some kind of graph manipulation.

The web is a huge directed graph. Google's key insight, that led them to dominate in search, is that the graph structure of the web is of comparable or greater importance than the textual content of the pages.

State machines are graphs. State machines are used in network protocols, regular expressions, games, and all kinds of other fields.

It's rather hard to think of anything you do that does not involve some sort of graph structure.

Brian Campbell's user avatar

An example most people are familiar: build systems. Make is the typical example, but almost any good build system relies on a Directed Acyclic Graph. The basic idea is that the direction models the dependency between a source and a target, and you should "walk" the graph in a certain order to build things correctly -> this is an example of topological sort.

Another example is source control system: again based on a DAG. It is used for merging, for example, to find common parent.

Nick Johnson's user avatar

Well, many program optimization algorithms that compilers use are based on graphs (e.g., figure out call graph, flow control, lots of static analysis).

Many optimization problems are based on graph. Since many problems are reducable to graph colouring and similar problems, then many other problems are also graph based.

I'm not sure I agree that graphs are the best way to represent every relation and I certainly try to avoid these "got a nail, let's find a hammer" approaches. Graphs often have poor memory representations and many algorithms are actually more efficient (in practice) when implemented with matrices, bitsets, and other things.

Uri's user avatar

OCR. Picture a page of text scanned at an angle, with some noise in the image, where you must find the space between lines of text. One way is to make a graph of pixels, and find the shortest path from one side of the page to the other, where the difference in brightness is the distance between pixels.

This example is from the Algorithm Design Manual , which has lots of other real world examples of graph problems.

RossFabricant's user avatar

One popular example is garbage collection.

The collector starts with a set of references, then traverses all the objects they reference, then all the objects referenced there and so on. Everything it finds is added into a graph of reachable objects. All other objects are unreachable and collected.

sharptooth's user avatar

To find out if two molecules can fit together. When developing drugs one is often interested in seeing if the drug molecules can fit into larger molecules in the body. The problem with determining whether this is possible is that molecules are not static. Different parts of the molecule can rotate around their chemical bindings so that a molecule can change into quite a lot of different shapes.

Each shape can be said to represent a point in a space consisting of shapes. Solving this problem involves finding a path through this space. You can do that by creating a roadmap through space, which is essentially a graph consisting of legal shapes and saying which shape a shape can turn into. By using a A* graph search algorithm through this roadmap you can find a solution.

Okay that was a lot of babble that perhaps wasn't very understandable or clear. But my point was that graphs pop up in all kinds of problems.

Erik Engheim's user avatar

Graphs are not data structures. They are mathematical representation of relations. Yes, you can think and theoretize about problems using graphs, and there is a large body of theory about it. But when you need to implement an algorithm, you are choosing data structures to best represent the problem, not graphs. There are many data structures that represent general graphs, and even more for special kinds of graphs.

In your question, you mix these two things. The same theoretical solution may be in terms of graph, but practical solutions may use different data structures to represent the graph.

J S's user avatar

The following are based on graph theory:

  • Binary trees and other trees such as Red-black trees, splay trees, etc.
  • Linked lists
  • Anything that's modelled as a state machine (GUIs, network stacks, CPUs, etc)
  • Decision trees (used in AI and other applications)
  • Complex class inheritance

IMHO most of the domain models we use in normal applications are in some respect graphs. Already if you look at the UML diagrams you would notice that with a directed, labeled graph you could easily translate them directly into a persistence model. There are some examples of that over at Neo4j

Peter Neubauer's user avatar

Social connections between people make an interesting graph example. I've tried to model these connections at the database level using a traditional RDMS but found it way too hard. I ended up choosing a graph database and it was a great choice because it makes it easy to follow connections (edges) between people (nodes).

Henri Liljeroos's user avatar

Graphs are great for managing dependencies.

I recently started to use the Castle Windsor Container, after inspecting the Kernel I found a GraphNodes property. Castle Windsor uses a graph to represent the dependencies between objects so that injection will work correctly. Check out this article .

I have also used simple graph theory to develop a plugin framework, each graph node represent a plugin, once the dependencies have been defined I can traverse the graph to create a plugin load order.

I am planning on changing the algorithm to implement Dijkstra's algorithm so that each plugin is weighted with a specific version, thus a simple change will only load the latest version of the plugin.

I with I had discovered this sooner. I like that quote "Whenever someone gives you a problem, think graphs." I definitely think that's true.

Rohan West's user avatar

Profiling and/or Benchmarking algorithms and implementations in code.

Jeremy Wall's user avatar

Anything that can be modelled as a foreign key in a relational database is essentially an edge and nodes in a graph.

Maybe that will help you think of examples, since most things are readily modelled in a RDBMS.

Randy's user avatar

You could take a look at some of the examples in the Neo4j wiki,

http://wiki.neo4j.org/content/Domain_Modeling_Gallery

and the projects that Neo4j is used in (the known ones)

http://wiki.neo4j.org/content/Neo4j_In_The_Wild .

Otherwise, Recommender Algorithms are a good use for Graphs, see for instance PageRank, and other stuff at

https://github.com/tinkerpop/gremlin/wiki/pagerank

RamC's user avatar

  • 2 This answer is a great example of why link only answers should not be well received on stack overflow. All of your links are broken now. –  mikeazo Commented Apr 23, 2015 at 18:15

Analysing transaction serialisability in database theory.

MalcomTucker's user avatar

You can utilise graphs anywhere you can define the problem domain objects into nodes and the solution as the flow of control and/or data amongst the nodes.

Considering the fact that trees are indeed connected-acyclic graphs, there are even more areas you can use the graph theory.

Sujoy's user avatar

Basically nearlly all common data structures like trees, lists, queues, etc, can be thought as type of graph, some with different type of constraint.

To my experiences, I have used graph intensively in network flow problems , which is used in lots of areas like telecommunication network routing and optimisation, workload assignment , matching , supply chain optimisation and public transport planning .

Another interesting area is social network modelling as previous answer mentioned.

There are far more, like integrated circuit optimisation , etc.

Wei's user avatar

Not the answer you're looking for? Browse other questions tagged data-structures graph graph-theory or ask your own question .

  • Featured on Meta
  • Upcoming sign-up experiments related to tags
  • Policy: Generative AI (e.g., ChatGPT) is banned
  • The [lib] tag is being burninated
  • What makes a homepage useful for logged-in users

Hot Network Questions

  • How will the ISS be decommissioned?
  • Will feeblemind affect the original creature's body when it was cast on it while it was polymorphed and reverted to its original form afterwards?
  • A 90s (maybe) made-for-TV movie (maybe) about a group of trainees on a spaceship. There is some kind of emergency and all experienced officers die
  • Is there any legal justification for content on the web without an explicit licence being freeware?
  • DSP Puzzle: Advanced Signal Forensics
  • Tombs of Ancients
  • Weird behavior by car insurance - is this legit?
  • Would the category of directed sets be better behaved with the empty set included, or excluded?
  • What kind of sequence is between an arithmetic and a geometric sequence?
  • Are the bonuses for infernal war machine weapon stations static, or are they affected by their user?
  • Are there paintings with Adam and Eve in paradise with the snake with legs?
  • What was the first game to intentionally use letterboxing to indicate a cutscene?
  • Fantasy TV series with a male protagonist who uses a bow and arrows and has a hawk/falcon/eagle type bird companion
  • How do you say "living being" in Classical Latin?
  • Interchangeability of る and ている
  • Old book about a man who finds an abandoned house with a portal to another world
  • Were there engineers in airship nacelles, and why were they there?
  • Are both vocal cord and vocal chord correct?
  • À + infinitive at start of sentence
  • Visit USA via land border for French citizen
  • How to fix misaligned objects that look fine in viewport but not in render?
  • How do guitarists remember what note each string represents when fretting?
  • What does ‘a grade-hog’ mean?
  • Cleaning chain a few links at a time

graph construction analysis and problem solving

IMAGES

  1. CS ASSESSMENT Problem Set on Graph Construction Analysis and Problem

    graph construction analysis and problem solving

  2. Process of knowledge graph construction.

    graph construction analysis and problem solving

  3. 1Ae-105-MMW-CS-ASSESSMENT-Graph-Construction-Analysis-and-Problem

    graph construction analysis and problem solving

  4. An example of the topic graph construction

    graph construction analysis and problem solving

  5. Semantic-based graph construction model

    graph construction analysis and problem solving

  6. SOLUTION: Cs assessment graph construction analysis

    graph construction analysis and problem solving

VIDEO

  1. kinematics_04 . GRAPHICAL ANALYSIS

  2. Graphical Method of truss analysis

  3. Line graphs

  4. 03. Autodesk Robot Structural Analysis professional Tutorials

  5. MA Module 6, Video 2, Graphing Cost Behaviour, Problem 6-1A

  6. Simple Graphic Truss Analysis Part 3

COMMENTS

  1. Data-driven graph construction and graph learning: A review

    3. Graph construction and learning methods. Given a vertex set V, two main steps are generally conducted to construct a graph.(1) Determine the topological structure of the graph, i.e., the edge set E (E-step for short); (2) Based on the current edge set, determine the weight matrix W (W-step for short). For some scenarios or methods, there may be no principled boundary between these two steps.

  2. Data-driven causal knowledge graph construction for root cause analysis

    Quality problem solving (QPS) (Xu, Dang, and Munro 2018) is one of the most impo... 1. Quality problems plague companies enhancing their competitiveness and negatively impacting financial performance. ... Data-driven causal knowledge graph construction for root cause analysis in quality problem solving. Zhaoguang Xu Institute of System ...

  3. Robust Graph Construction

    Abstract. Graphs have been widely applied in modeling the relationships and structures in real-world applications. Graph construction is the most critical part in these models, while how to construct an effective graph is still an open problem. In this chapter, we propose a novel approach to graph construction based on two observations.

  4. Graph Construction Based on Local Representativeness

    Graph construction is a known method of transferring the problem of classic vector data mining to network analysis. The advantage of networks is that the data are extended by links between certain (similar) pairs of data objects, so relationships in the data can then be visualized in a natural way. In this area, there are many algorithms, often ...

  5. PDF A Graph-Constructive Approach to Solving Systems of Geomeric Constraints

    The problem can be coded as a constraint graph G = (V, E), in which the graph nodes are the geometric elements and the constraints are the graph edges. The edges of the graph are labeled with the values of the distance and angle dimensions. Example1 Figure 2 shows a dimensioned sketch defming a constraint problem involving 4 lines and 6 points.

  6. CS ASSESSMENT Graph Construction Analysis and Problem Solving 1.pdf

    1. Find A layout of a floor plan of an office space (one floor only). Draw a graph (with complete label) that represents the floor plan, where each vertex represents a room in the office space and an edge connects two vertices if there is a doorway between the two rooms. Then answer the following questions: Questions: a. Is it possible to walk through the office and pass through each doorway ...

  7. GRAPH CONSTRUCTION. ANALYSIS AND PROBLEM SOLVING.pdf

    2MATHMWORLD GRAPH CONSTRUCTION, ANALYSIS, PROBLEM SOLVING NAME: _____ SECTION: _____ Answer the following problems. Show your complete solution or provide an explanation to support your answer. 1. Refer to the given graph below. a b e g c d f a. How many edges and vertices are there in the graph?

  8. Analysing complex engineering situations through problem graph

    Hereafter we mention the updating and up keeping of information relevant to problem solving process in the graph. 2.3. Problem graph updates and issues Our goal is to maintain largest amount of information possible regarding the project for a given amount of time because companies a lot very limited time for initial situation analysis. Graph is ...

  9. PDF G. GraphicalandNumericalMethods

    raphical and numerical methods. Carried out by hand, the graphical methods give rough qualitative information about how the graphs of solut. ons to (1) look geometri-cally. The numerical methods then give the actual graphs to as great an accuracy as desired; the computer does the numerica. work, and plots t.

  10. Data-driven causal knowledge graph construction for root cause analysis

    The result shows that there is a lack of utilisation of data, information, and knowledge in problem-solving. Based on the analysis result and the demands of plants for improving the efficiency and ...

  11. Data-driven causal knowledge graph construction for root cause analysis

    A data-driven framework to mine large-scale causalities between quality problems and production factors from QPS data and exploit a causal knowledge graph for quality problems (QPCKG) to express these causalities. Root cause analysis (RCA) plays an essential role in quality problem solving (QPS). Due to the difficulty of obtaining causal knowledge of quality problems, companies often rely on ...

  12. Graph problems

    All the models dealt with here are based on the definition of a graph. A graph is an abstract concept, a construction derived from vertices and edges linking two vertices, but many of the practical optimization problem can be defined naturally by means of graphs. ... When solving the graph coloring problem with a mathematical optimization ...

  13. A graph-constructive approach to solving systems of geometric

    The analysis phase of the algorithm is described in detail, its correctness is proved, and an efficient algorith to realized it is presented. The scope of the graph analysis is then extended by utilizing semantic information in the form of anlge derivations, and by extending the repertoire of the construction steps.

  14. Data-driven causal knowledge graph construction for root cause analysis

    Downloadable (with restrictions)! Root cause analysis (RCA) plays an essential role in quality problem solving (QPS). Due to the difficulty of obtaining causal knowledge of quality problems, companies often rely on expert experience and conventional RCA tools when conducting RCA. Rich QPS data have remained mostly untapped but provide the potential for causal knowledge mining, while the ...

  15. Data-driven causal knowledge graph construction for root cause analysis

    Data-driven causal knowledge graph construction for root cause analysis in quality problem solving. Zhaoguang Xu and Yanzhong Dang. International Journal of Production Research, 2023, vol. 61, issue 10, 3227-3245 . Abstract: Root cause analysis (RCA) plays an essential role in quality problem solving (QPS). Due to the difficulty of obtaining causal knowledge of quality problems, companies ...

  16. Initial situation analysis through problem graph

    1. Introduction1.1.. Problem formulation as a design stageWhile often described as an important aspect in the design process, problem formulation as a design stage is poorly exploited by designers [1].It often occurs at the early stages of a project and is crucial for affecting the direction and success of all succeeding stages [2], [3], [4].If weakly or poorly considered, the result is often ...

  17. Call Graph Construction Algorithms Explained

    The call graph toolbox implements a call graph based on the results of a context-insensitive points-to analysis I wrote available in the points-to toolbox. A context-insensitive points-to analysis is called a 0-CFA (when context is defined as calling context). The results for our first example are shown below.

  18. Graph Data Structure And Algorithms

    Graph is a non-linear data structure consisting of vertices and edges. The vertices are sometimes also referred to as nodes and the edges are lines or arcs that connect any two nodes in the graph. More formally a Graph is composed of a set of vertices ( V ) and a set of edges ( E ). The graph is denoted by G (V, E).

  19. "Data-driven causal knowledge graph construction for root cause ...

    Bibliographic details on Data-driven causal knowledge graph construction for root cause analysis in quality problem solving. Stop the war! Остановите войну! solidarity - - news ... Data-driven causal knowledge graph construction for root cause analysis in quality problem solving. Int. J. Prod. Res. 61 (10): 3227-3245 (2023) a ...

  20. data structures

    Well, many program optimization algorithms that compilers use are based on graphs (e.g., figure out call graph, flow control, lots of static analysis). Many optimization problems are based on graph. Since many problems are reducable to graph colouring and similar problems, then many other problems are also graph based.

  21. CS ASSESSMENT Problem Set on Graph Construction Analysis and Problem

    View CS ASSESSMENT_ Problem Set on Graph Construction, Analysis, and Problem Solving (2).docx from MATH 123 at Holy Angel University. Answer the following: 1. Using the graph in Figure 1, is there

  22. From data to knowledge: Construction process analysis through

    Construction objects are linked time- and location-wise in a knowledge graph representing the construction phase. • The knowledge graph is facilitated to extract construction metrics and identify potential processual bottlenecks. • Additional data sources, like temperature and wind data, are harnessed and fused with the knowledge graph to ...

  23. PDF Analyzing Graphs 6-4 Practice and Problem Solving: A/B

    5. Graph 2. Sample answer: The car eases into traffic and has to stop. After a while, it starts up fast and then stops. 6. 7. 20 mi Practice and Problem Solving: C 1. Graph C 2. Graph A 3. Graph B; Sample answer: Drives around and to the gas station. Fills his gasoline tank. Drives around some more. enough to buy a gym membership. Then I 4.

  24. Construction of Event Graph for Ship Collision Accident Analysis to

    Knowledge graphs: Based on the construction of a ship collision accident knowledge graph, the objective correlation between various factors in the accident was obtained, and on this basis, ... and accident conduction path analysis. 2. Problem Formulation 2.1. Problem Description.

  25. PDF Supreme Court of The United States

    as if the analysis would be correct if only it were belabored. Post, at 34. And yet it is the dissent that relegates the text of the relevant statute, §1123(b), to a pair of footnotes bookending a 25-page exposition on collec-tive-action problems and public policy, one that precedes any effort to engage with our statutory analysis. See . post