Can A* be used to generate algorithm source code?
I thought of an idea to use the A* algorithm to "walk" toward a solution to a programming problem. Each neighbour could be drawn from a weighted list of programming constructs such as an if statement, a for-each loop, an assignment, or a recursive call. How do you map example transformations of a data structure to code that carries out those changes? You need a cost function that decreases as the output gets nearer to the solution. What does a cost function for code that solves a problem look like?
If you project the desired code's variables onto a multidimensional space with a time dimension, where each variable has an assignment algebra, you can imagine the input code causing movements toward the output algebra for each variable.
How do you measure whether a candidate is getting nearer to the solution when you haven't planned what you could do?
It should be a multidimensional walk toward a solution that solves the programming problem.
I want to generate the B-tree algorithm and other algorithms with this approach.
I should provide example input-to-output mappings, join the data between the input and output, and try to recalculate the processing steps that go from one to the other, but I am having problems deciding what the cost function looks like.
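One possible cost function, sketched here under the assumption that each state is flattened into a set of relationship facts (tuples), is the number of facts that still differ between the current state and the target. The name `fact_distance` and the tuple layout are hypothetical, chosen only for illustration:

```python
def fact_distance(current, target):
    """Count the facts that must be added or removed to turn current into target."""
    # Symmetric difference: facts present in exactly one of the two sets.
    return len(current ^ target)

# Facts taken from Example 1 (insert node4 under node1).
start = {
    ("root", "node1"),
    ("child", "node1", "node2"),
    ("child", "node1", "node3"),
}
goal = start | {("child", "node1", "node4")}

print(fact_distance(start, goal))  # 1
print(fact_distance(goal, goal))   # 0
```

This only measures how far the data is from the target. It says nothing about how general the generated code is, so it would probably need to be combined with a penalty for program length or for failing other examples.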
We can scan the output structure and generate facts for each relationship in the data.
We can compare where the data moved, based on a path traversal from source data to destination data, and generate a patch.
For example: this object has a pointer to that object, which points to this piece of data.
Input data
Node1 is root
Node1 children node2
Node1 children node3
Example 1
Insert a node node4
Desired Output data
Node1 is root
Node1 children node2
Node1 children node3
Node1 children node4
Patch
Node4 inserted into node1 children
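A minimal sketch of how the facts and the patch for Example 1 could be computed, assuming a tree is given as a root name plus a dict of children. The helper names `facts` and `make_patch` are made up for this illustration:

```python
def facts(root, children):
    """Flatten a tree into a set of relationship facts."""
    out = {("root", root)}
    for parent, kids in children.items():
        for kid in kids:
            out.add(("child", parent, kid))
    return out

def make_patch(before, after):
    """List the facts to remove and the facts to add to get from before to after."""
    removed = sorted(before - after)
    added = sorted(after - before)
    return [("remove",) + f for f in removed] + [("add",) + f for f in added]

before = facts("node1", {"node1": ["node2", "node3"]})
after = facts("node1", {"node1": ["node2", "node3", "node4"]})
patch = make_patch(before, after)
print(patch)  # [('add', 'child', 'node1', 'node4')]
```

The patch says what changed but not why, which is the part the theorising step has to supply.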
Why node1? Theorise.
Node1 is root
Run operators against node1, node4
Len(node1.children) <= MaxSize
Example 2
Node split - insert node5
Input data
Root node is node1
Node1 children node2
Node1 children node3
Node1 children node4
Desired target data
Root node is node3
Node3 children node1
Node3 children node2
Node3 children node4
Node4 children node5
This represents a node split as each node can only have 3 items inside it.
How do we learn the rule that a node can only have 3 nodes?
Insert a constant into the system of facts
MaxSize is 3
Insert operators into the system
Len(node.children)
MaxSize
== (equal to)
Run every permutation of the operators to decide when to perform each patch command step.
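As a rough sketch of that enumeration, assume each example is reduced to one observation: how many children node1 had before the insert, and whether a split happened. Every comparison operator is then tried against the constant, keeping those that agree with all observations. The `observations` encoding and the function name are assumptions for illustration:

```python
import operator

OPERATORS = {
    "==": operator.eq,
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
}
MAX_SIZE = 3

# (len(node1.children) before the insert, did a split happen?)
# Example 1: 2 children, no split. Example 2: 3 children, split.
observations = [(2, False), (3, True)]

def consistent_predicates(observations, constant):
    """Return the predicates whose truth value matches 'split happened' in every example."""
    found = []
    for name, op in OPERATORS.items():
        if all(op(n, constant) == split for n, split in observations):
            found.append(f"len(node.children) {name} MaxSize")
    return found

print(consistent_predicates(observations, MAX_SIZE))
# ['len(node.children) == MaxSize', 'len(node.children) >= MaxSize']
```

With only two examples, `==` and `>=` both survive, so more examples would be needed to tell them apart.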
Patch instructions
Node1 is no longer root node
Node3 is root node
Node2 children is node1
Remove node4 from node1
Why was node4 removed from node1?
Why was node4 added to node3?
Theorise. What fact was true?
Compare node1 properties to node4
Eventually....
Len(node1.children) == MaxSize
Add node4 to node3
Why? Theorise. What fact is true?
Node4 >= Node3 true
Does >= hold for all examples? Or is it more complicated?
Twist operation. One goes down, one goes up, and the one going down joins one going down.
Root = A
Node1 is A
New root = B
Twist operation is:
Root = B
B.children = A
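The twist could be sketched like this, assuming a bare node type with a name and a list of children. `Node` and `twist` are hypothetical names, and the sketch covers only the root swap, not the redistribution of the other children seen in Example 2:

```python
from dataclasses import dataclass, field

@dataclass(eq=False)  # compare nodes by identity, not by contents
class Node:
    name: str
    children: list = field(default_factory=list)

def twist(old_root, new_root):
    """Promote new_root to root and make old_root its child."""
    if new_root in old_root.children:
        old_root.children.remove(new_root)  # B goes up, out of A's children
    new_root.children.insert(0, old_root)   # A goes down, under B
    return new_root

a = Node("A")
b = Node("B")
a.children.append(b)

root = twist(a, b)
print(root.name)                         # B
print([c.name for c in root.children])   # ['A']
```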
Then see where the data moves. We need to generate the steps that deterministically create the same structure with the same data in the right places.
There should be patterns. Usually there is a property of the object that is compared against to perform a B-tree split.
So there is at most one field or comparison that determines the destination of the object in the B-tree.
I thought we could randomly generate condition statements for each input object.
Walk the patch and insert condition objects.
It's a graph transformation problem.
I had a thought. If you imagine the output generated code as a superset then every line of code is part of a subset of the output code.
So you need some way of generating subsets of the output set.
Each set is an ordered set of instructions that generates the answer. I want to avoid brute-force search: every attempt at generating a line of code should get nearer to the output.
I think we have to provide heuristics of approximately what the code should do.
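To make the idea concrete, here is a small sketch of A* over ordered instruction sequences, assuming the only instructions are "add a fact" and "remove a fact" and the heuristic is the number of facts that still differ from the target. All names (`candidate_facts`, `astar_patch`) are hypothetical, and this searches straight-line edits only. It does not generate conditions, loops, or recursion:

```python
import heapq
from itertools import count, permutations

def candidate_facts(nodes):
    """Every fact an instruction is allowed to add or remove."""
    found = [("root", n) for n in nodes]
    found += [("child", p, c) for p, c in permutations(nodes, 2)]
    return found

def astar_patch(start, goal, nodes):
    """Return a shortest list of (verb, fact) instructions turning start into goal."""
    start, goal = frozenset(start), frozenset(goal)

    def h(state):
        return len(state ^ goal)  # facts still wrong

    tie = count()  # unique tie-breaker so the heap never compares states
    frontier = [(h(start), h(start), next(tie), start, [])]
    best_g = {start: 0}
    while frontier:
        _, _, _, state, path = heapq.heappop(frontier)
        if state == goal:
            return path
        g = len(path) + 1
        for fact in candidate_facts(nodes):
            nxt = state ^ {fact}  # toggle one fact
            if g < best_g.get(nxt, float("inf")):
                best_g[nxt] = g
                verb = "remove" if fact in state else "add"
                step = path + [(verb, fact)]
                heapq.heappush(frontier, (g + h(nxt), h(nxt), next(tie), nxt, step))
    return None

nodes = ["node1", "node2", "node3", "node4"]
start = {("root", "node1"), ("child", "node1", "node2"), ("child", "node1", "node3")}
goal = start | {("child", "node1", "node4")}
print(astar_patch(start, goal, nodes))
# [('add', ('child', 'node1', 'node4'))]
```

Because each instruction changes exactly one fact, the heuristic never overestimates the remaining steps, so every expansion that follows it moves nearer to the output. For richer instruction sets (conditions, loops) a comparably informative heuristic is the hard part.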