Coherent Zero-Shot Visual Instruction Generation
Quynh Phung
Fu Yu
Songwei Ge
Jia-Bin Huang
University of Maryland, College Park
Under submission
Visual instruction generation
Consistent Image Generation
[Interactive figure: Baseline vs. Ours across Steps 1 to 4, with a state similarity panel. Selectable baselines: Independent, Key-Value. Arrows mark inconsistent objects and text misalignment.]
Concatenate Actions and States
[Interactive figure: Baseline vs. Ours across Steps 1 to 4, shown for both steps and states. Selectable baselines: Action, Concatenate Actions. Arrows mark inconsistent objects and text misalignment.]
Baseline Comparison
[Interactive figure: GII, GenHowTo (State), GenHowTo (Action), and Ours compared across Steps 1 to 4.]
More visual instructions
[Interactive figure: additional results from Ours across Steps 1 to 4.]