Coherent Zero-Shot Visual Instruction Generation

University of Maryland, College Park
Under submission

Visual instruction generation

[Figure: teaser image]

Consistent Image Generation

[Interactive figure: generated images for Steps 1 to 4, comparing a selectable baseline (top row) with ours (bottom row). Annotation labels: "Inconsistent object", "Text misalignment". Accompanied by a 4×4 state similarity matrix over the steps.]

Concatenate Actions and States

[Interactive figure: generated images for Steps 1 to 4, comparing a selectable baseline (top row) with ours (bottom row). Annotation labels: "Inconsistent object", "Text misalignment". Below the images, the text is shown per step in two rows: Steps and States.]

Baseline Comparison

[Interactive figure: generated images for Steps 1 to 4, one row per method: GII, GenHowTo (State), GenHowTo (Action), and ours.]

More visual instructions

[Interactive figure: additional samples generated by ours for Steps 1 to 4.]