Table 1: Details of example automata states as described by domain experts.
Automata State | State Description | Question-Type in Dataset
greeting | Shopper greets | -
self-info | Shopper gives information about him/herself | -
give-requirement | Shopper describes his requirements | -
show-image | Agent shows relevant responses for the Shopper's query | show_result
give-image-description | Agent generates a short description of the product, using visual and catalog information | -
Like/Dislike specific items / overall items and show-more | Shopper expresses preference towards one or more of the last results or previously shown results, possibly modifies his requirements (optionally by posting a new image), and wants to see more | like_n_show_result, do_not_like_n_show_result, like_earlier_show_result, or do_not_like_earlier_show_result
show-orientation | Shopper wants to see an item from different orientations | show_orientation
show-similar | Shopper wants to see items similar to a particular item | show_similar_to
goes-with | Shopper asks for a style tip | go_with
ask-attribute | Shopper asks about the attributes of the items shown | ask_attribute
suited-for | Shopper asks for other types of meta-info about the items shown | suited_for
sort-results | Shopper wants to sort the result set by some attribute | sort_results
filter-results | Shopper wants to filter the results based on some attribute | filter_results
celebrity | Shopper asks questions relating to some celebrities and their fashion items of interest | celebrity
switch-synset | Shopper wants to switch back to the type of fashion synset he had seen previously | -
buy | Shopper wants to buy one or more items | buy
exit | Shopper wants to exit | -
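
As a rough illustration, the state-to-label mapping in Table 1 could be held in a simple lookup structure. The sketch below is not part of the dataset release: the names `STATE_TO_QUESTION_TYPES`, `question_types`, and `labelled_states` are hypothetical, and the key `like-dislike-show-more` is a shortened stand-in for the long Like/Dislike row.

```python
# Hypothetical lookup built from Table 1: automata state -> question-type labels.
# States shown with "-" in the table map to an empty tuple.
STATE_TO_QUESTION_TYPES = {
    "greeting": (),
    "self-info": (),
    "give-requirement": (),
    "show-image": ("show_result",),
    "give-image-description": (),
    # Shortened key (an assumption) for the Like/Dislike ... show-more row.
    "like-dislike-show-more": (
        "like_n_show_result",
        "do_not_like_n_show_result",
        "like_earlier_show_result",
        "do_not_like_earlier_show_result",
    ),
    "show-orientation": ("show_orientation",),
    "show-similar": ("show_similar_to",),
    "goes-with": ("go_with",),
    "ask-attribute": ("ask_attribute",),
    "suited-for": ("suited_for",),
    "sort-results": ("sort_results",),
    "filter-results": ("filter_results",),
    "celebrity": ("celebrity",),
    "switch-synset": (),
    "buy": ("buy",),
    "exit": (),
}


def question_types(state):
    """Return the question-type labels for a state (KeyError if unknown)."""
    return STATE_TO_QUESTION_TYPES[state]


def labelled_states():
    """Return, in sorted order, the states that carry at least one label."""
    return sorted(s for s, labels in STATE_TO_QUESTION_TYPES.items() if labels)
```

With this layout, `question_types("goes-with")` returns the single label `go_with`, while states such as `greeting` return an empty tuple because Table 1 lists no question type for them.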

Table 2: Domain-Specific Knowledge Base Statistics
Knowledge Base Statistic | Value | Examples
Number of Items Crawled | 1.05M | -
Avg. Number of Images per Item | 4 | -
Number of Fashion Synsets | 716 | shirt, trouser, tuxedo, loafer, stilettos, sunglasses, handbag, earring
Number of Fashion Attributes | 52 | color, pattern, material, brand, style, dial material, band color, lens material, price, wash-care
Number of Visual Fashion Attributes | 45 | color, pattern, material, style, neck, sleeves, length, sole type, closure
Number of Meta-Info Attributes | 8 | price, wash-care, product ranking, brand, size, suitable occasions
Avg. Number of Fashion Attributes per Synset | 16 | -
Avg. Number of Values per Attribute | 4200 | -
Number of Coarse-Grained StyleTips (Synset, Synset association) | 8871 | shirt trouser, tshirt sneakers, tuxedo cufflinks, suit formal shoes, kurta jeans
Number of Fine-Grained StyleTips (Attribute Synset, Attribute Synset association) | 350K | white shirt black trousers, light t-shirt dark jacket, black gown and silver stilettos
Number of Celebrity Profiles | 411 | Celeb1 likes vegan-leather shoes, monochrome pastel shaded t-shirts and polyester jackets
Avg. Number of Fashion Synsets Endorsed by a Celebrity | 4 | Celeb1 endorses lehenga, saree, blouse, kurti, sandals and stilettos
Avg. Number of Celebrities Endorsing a Fashion Synset | 15 | -
Number of Fashion Synsets Endorsed by Celebrities | 252 | -

Table 3: Multimodal Dialog Dataset Statistics
Dataset Statistic | Train | Valid | Test
Number of Dialogs (chat sessions) | 105,439 | 22,595 | 22,595
Proportion in Terms of Dialogs (%) | 70 | 15 | 15
Avg. Number of Utterances per Dialog | 40 | 40 | 40
Number of Utterances with Shopper's Questions | 2M | 446K | 445K
Number of Utterances with Agent's Image Response | 904K | 194K | 193K
Number of Utterances with Agent's Text Response | 1.54M | 331K | 330K
Avg. Number of Positive Images in Agent's Image Response | 4 | 4 | 4
Avg. Number of Negative Images in Agent's Image Response | 4 | 4 | 4
Avg. Number of Words in Shopper's Question | 12 | 12 | 12
Avg. Number of Words in Agent's Text Response | 14 | 14 | 14
Avg. Number of Automata States per Dialog | 15 | 15 | 15
Avg. Number of Dialogs Having a Particular Automata State | 20,246 | 4,346 | 4,335
Avg. Number of Automata State-Types per Dialog | 13 | 13 | 13
Avg. Number of Dialogs Having a Particular State-Type | 59,638 | 12,806 | 12,764
Number of Automata States | 84 | 84 | 84
Number of Automata State-Types | 22 | 22 | 22
Vocabulary Size (with minimum frequency 4) | 26,422 | - | -
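
The split proportions in Table 3 can be recomputed from the dialog counts. The following is a minimal sketch, assuming the proportion row is a percentage of the total number of dialogs across the three splits; the function name `split_percentages` is illustrative only.

```python
# Dialog counts per split, copied from Table 3.
dialogs = {"train": 105_439, "valid": 22_595, "test": 22_595}


def split_percentages(counts):
    """Return each split's share of the total, as a percentage rounded to 0.1."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


percentages = split_percentages(dialogs)
```

Under that assumption the counts sum to 150,629 dialogs, and the rounded shares agree with the 70/15/15 split reported in the table.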