------------------------------------------------------------------------
Problem: Identify substructures in MIP problems that induce integrality.
------------------------------------------------------------------------

Motivation: By branching toward such substructures, integrality can be
achieved early in the B&B tree.

Sketch of idea: Certain constraint matrices (e.g., totally unimodular,
network flow) produce integer polyhedra. If branching reduces the
constraint matrix of an MIP problem to such a matrix, and if the
right-hand-side at that time is integral, then it is guaranteed that
all variables take on integer values. Thus, identifying such special
submatrices may be useful for solving IPs. For MIPs the situation is
more complicated. Here, the approach may be useful if the constraint
matrix has block-triangular structure where the top diagonal block is
associated with the integer variables and where that block has a
special submatrix of large size.  Branching then attempts to make that
submatrix the effective constraint matrix for the integer variables.

A related idea concerns matrix decompositions. Here, one tries to find
structures that, after fixing of just a few variables, result in
independent blocks. This can be done rather effectively.  These
variables also are candidates for backdoors. Note that the idea works
regardless of the specific objective function. Subsequent branching is
always done within a block untill all variables with integrality
requirement do become integral.

Difficulty: Finding a largest column submatrix with the special
property is difficult.  One might investigate heuristics. For example,
one might investigate a branching rule that emphasis integer variables
with coefficients in short inequalities of length greater than or
equal to 3. The hope is that, once all such inequalities have been
eliminated, a network flow matrix is at hand. Actually, the remaining
problem could be handled by the matching algorithm, so one may want to
derive blossom inequalities if the resulting matrix is not a network
flow matrix.

On the other hand, there are very fast heuristics for matrix
decompositions that work very well.  Furthermore, software exists
already for these heuristics, so the decomposition idea can be tested
quickly and with little overall effort.

-------------------------------------------------------------
Problem: Identify backdoor variables or other attractive branching
variables via machine learning.
-------------------------------------------------------------

Motivation: Machine learning is able to discover hidden important
relationships in data. Possibly this applies also to learning
effective branching conditions.

Sketch of idea: After a problem has been solved, one may investigate
the path from the root to the solution node, or possibly to other
nodes as well.  For each node, one determines a characteristic vector
and also decides which branching rule of a collection of rules would
have made the right branching decision. One then analyses the
characteristic vectors using one or more machine learning schemes to
determine a formula that would have selected the correct branching
decision. Example machine learning techniques are neural nets, support
vector machines, naive Bayes methods, methods computing logic
separation formulas, statistical schemes.

Slightly modified, the above idea amay also pply to detection of
backdoor variables.  Here, one takes the path from the root to the
solution node and by repeated solving determines a minimum set of
fixed variables that already induces the optimal solution or that
makes the major progress toward that solution.  One then identifies
characteristic vectors for the variables, compares them with the
characteristics of the other variables, and by machine learning
determines a formula that can tell the difference.

Difficulty: It maybe be that characteristics containing the essential
information are too complex to defy detection. Also, the machine
learning method must be able to produce compromise formulas that do
not work perfectly but still are useful. Such compromise formulas must
minimize the total cost of errors, at least approximately.

-------------------------------------------------------------
An additional problem
-------------------------------------------------------------

Consider your favorite NP-Complete (NP-Hard) problem: e.g., TSP,  a
latin square completion problem, a  network problem, etc.

Consider a poly-time sub-solver: e.g., LP, unit-propagation, a network
flow algorithm, etc.

Design families of syntactic parametrized instances and show
interesting  results on the existence or non-existence of backdoor
sets of a given size. Ideally we  would like intuitions on small
backdoor sets (e.g., log n).

References:

For  a  definition of weak backdoor, strong backdoor, and sub-solver
see paper "Backdoors to Typical Case Complexity" (Williams, Gomes, and
Selman)

http://www.cs.cornell.edu/gomes/papers/backdoors.pdf

Structure and Problem Hardness: Goal Asymmetry and DPLL Proofs in
Sat-Based Planning Hoffmann, Gomes, and Selman, 2006 (This paper
provides intuitions for backdoors in syntactic planning problems,
considering unit-propagation as a sub-solver)

http://www.cs.cornell.edu/gomes/papers/ICAPSb-jan26.pdf

Backdoor Sets for DLL Subsolvers (S. Szeider)
http://www.dur.ac.uk/stefan.szeider/abstract19.html

Detecting Backdoor Sets with Respect to Horn and Binary Clauses
http://www.dur.ac.uk/stefan.szeider/abstract16.html