policy improvement iteration